PREFACE


A book is written for its readers, as the rumour goes. During the
writing the author begins to form an image of these unknown people.
They are supposed to know everything which the author is not willing
to explain. In the present case it means that they know what a group
is and that they know about computers and parity checks. On the other
hand, they are not supposed to know much about group theory, so that
several elementary concepts have to be explained. Chapter 0 contains a
philosophically tinted discussion, aimed at clarifying the mental
attitude of the author towards the coding problems. It is thought to
be understandable to almost everyone. Chapter 1, on the contrary, is
more specialized and gives some upper bounds for the size of error
detecting codes. It is the most unfinished part, but as such it
reflects the state of the art, since very often the limits are the
last to be known to science.


The reader who is only eager to know what codes are available to him
can safely skip chapter 1. For him chapter 2 offers a survey of what
is available. For his convenience the new codes, which are constructed
in chapters 3, 4 and 5, are also reviewed. The mathematical level of
this chapter is, in view of the supposed level of his mathematical
erudition, kept as elementary as possible. Even if the group concept
is occasionally mentioned, it is only meant for those who can
appreciate it, whereas the others can ignore it without losing the
thread of the argument. The application-minded reader will probably
stop here, since the next chapters deal primarily with the methodology
of finding codes with prescribed properties. That is why in these
chapters the mathematical level is higher, in the sense that more
complicated notations and arguments are used. In chapter 3 the
possibilities for codes based on addition modulo 10 are screened,
whereas in chapter 4 the analogous problem for codes based on the
dihedral group of order 10 is solved. Finally, chapter 5 deals with
the so-called biquinary codes, culminating in a new and quite
remarkable one.

For insiders, it may be pointed out that chapters 4 and 5 give pure
decimal codes which detect all transcription errors and all
transpositions of adjacent symbols. This refutes the non-existence
"proof" occurring in the literature. The author believes that the
codes explained in chapter 4 provide the first practical application
of the dihedral group. This would illustrate the old saying that all
beautiful mathematics will find an application, sooner or later.


For the sake of completeness, the bibliography also refers to relevant
literature not mentioned in the text.


The author wishes to express his gratitude to the Delft University of
Technology and the Mathematical Centre at Amsterdam, for putting their
computer facilities at his disposal, and to the Amsterdam Municipal
Clearing Office and the Netherlands Postal Check and Giro Service,
for entrusting him with their error statistics and for their
encouragement. His thanks also go to Mr. A. Benard and Dr. A.D.
Colenbrander for allowing him to disclose their codes. He is also much
indebted to the professors Dr.ir. A. van Wijngaarden and Dr. W.
Peremans, who carefully read the manuscript, for their many valuable
suggestions.


It is impossible to mention everybody who has stimulated or helped the
preparation of this book, but an exception will be made for the staff
of the Mathematical Centre, who have to be complimented on the record
speed with which the book was reproduced.


A general word like code is one of the hard working words, in the
sense of Humpty Dumpty. Originally code referred to a law, written or
not. In cryptology the word code is used in contradistinction to
cipher. By a code is meant a system of substitution in which many
words, phrases or syllables are replaced by code words or code
numbers. The word cipher refers to a system in which the individual
letters are worked upon. The commercial codes of the late twenties
were used to cut down the costs of cablegrams. The military codes have
secrecy as their main purpose. Nowadays codes are widely used in the
theory and practice of switching circuits, culminating in the design
and use of computers.

In this monograph coding is understood to be a mapping of an arbitrary
set into a set of mathematical entities. The first set is often a set
of tangible objects, persons or concepts, whereas the second set
mostly consists of symbols or strings of symbols. The structural
formulae of organic chemistry form an example of the application of
mathematical entities other than strings of symbols.

It is quite essential that the mapping is one to one, since the basic
idea is to use the abstract entities as names for the elements of the
first set. In practical cases the main difficulty lies in the
definition of the mapping. One can hardly attach the abstract entities
to the objects or persons, though only the persons might object. This
is, of course, the denotation problem, which is solved, more or less,
by the use of tokens.
Tokens are physical representations of symbols. For every symbol,
there is a whole class of different tokens, which are commonly
understood to stand for the same symbol. Examples are all types of
"threes" in all kinds of colours, printed, written or spoken,
including the less generally agreed upon way to represent a "3" in a
computer. The borderlines of these classes are sometimes dangerously
vague. The choice of most tokens, which was made historically, would
nowadays be called a very poor job of system design, as everyone
involved in character recognition will concede. But it is too late for
a change; all attempts to introduce a new alphabet will be utopian.
The world will have to live with the old one. Returning to the
definition of the mapping, it will be clear that one can attach to
objects one or more tokens, as a label. These tokens represent the
symbols on which the object has been mapped by the coding. Branding
cattle may be one form, and engraving a serial number in guns or
engines is another way of explicit labeling. The labeling may have a
dual purpose, since it may be done in order to make identical objects
different. On the other hand, the branding of the cattle may be done
to establish the ownership. Anyhow, such an effective but crude way to
define a coding is impossible if the "objects" to be coded are
concepts. For people the method may theoretically be feasible, but is
hardly advisable, especially if these people are customers. The
customary procedure is then to make some list, in which series of
tokens, representing the code, are linked with a verbal description of
the coded objects. Such a list is called a code book or catalogue.
Actually the situation is rather tricky, since it might be said that
such a description itself is (a notation for) a code. Hence the
question would remain how to define the latter code. Since a verbal
description seldom really characterizes the object, it may be
questioned whether such a description is a code. This does not make
the situation any better. In fact, the descriptions use, as a rule,
all kinds of contexts, written or not, to help define the objects.
Often it is supposed to be clear that the object is one of a known
(how?) class. This type of problem is of course inherent to all
(successful) communication. Strictly speaking, communication is
essentially impossible, but it sometimes works. It is merely a matter
of success and efficiency how far one has to go with refining the
descriptions. Parenthetically it may be remarked that the
characterization of persons by fingerprints or sets of measures may be
very practical, but theoretically the system is never foolproof, and
that not only because of the fingerless people. There is a difficulty
for every solution.
Summarizing, the one to one mapping of an arbitrary set into a set of
mathematical entities is called coding. The second set is then called
a code. Notation is a physical representation of the second set by
means of tokens. These tokens fall apart into classes of equivalent
ones, each representing the same mathematical symbol. The equivalence
is based on a common understanding and is as such a potential source
of confusion. To help to avoid or at least detect this confusion is
the aim of the following chapters.
An important point is that the common understanding of the tokens is
some kind of social phenomenon. It is as such amenable to a study,
which will of course be of a statistical nature. It is conceivable to
measure the degree of intersubjectivity by controlled experiments. One
could let many people write a "3" and one could then measure how often
other, or the same, people recognize it correctly. It might turn out
that a "3" is a better token than, say, a "5". It is rather difficult,
if not impossible, to get unbiased information on this recognition
problem. The known error statistics on codes show a certain
one-sidedness (see 36) in the sense that, e.g., a "q" can easily be
mistaken for a "g", but seldom a "g" becomes a "q". This may very well
be due to the relative frequencies of use of the various tokens.





The use of the decimals is rooted in tradition. Unlike with the binary
codes, there is no intrinsic reason for its use. It is just because
people are used to it that the decimal system is so important. It is
therefore not surprising that the decimal codes are mostly handled, at
least partly, by human beings. The same holds in a way for alphabetic
codes. For mnemonic reasons, it was believed in the past that codes
for human use should be of the alphabetic or alphanumeric type. Car
licence numbers and telephone numbers in various countries are relics
of this belief. At present, where human use and machine handling are
mostly combined, the decimal codes are getting more popular. Another
reason may be that recent studies indicate that the alphabetic
characters are more error prone than the decimals (see 36).

As said before, the code words are intended to be used as names for
the things for which they stand. A name is needed if a reference is to
be made to something. Such a reference will be called a mutation. As a
rule, the mutations are part of a process, say an administrative one.
For the process the code words serve as input. The primary reason for
the use of a code, rather than the natural language, is efficiency.
The fact that a code is unambiguous is not a good argument, since that
effect could also be obtained by properly extending the natural names
or descriptions, so that here again efficiency is the basic motive. In
data processing systems it is customary that the various inputs are
unrelated and come from many origins. Thus the mutations converge into
the system.
The coding is often done in the periphery, by the users or customers,
and is therefore largely outside the control of the system. As a
consequence the system has to cope with the errors made. The redundant
codes, which will be dealt with in the next section, serve as a
defense of the system against these errors. To be sure, some of the
errors are caused by human operators incorporated in the system for
the preparation of the machine readable records. This operation is of
course under the system's responsibility, and error prevention should
be practiced anyhow. The redundant coding is in fact a burden for this
preparatory operation, which causes some authors to reject redundant
coding altogether (see 6). But there is a tendency to push the
preparation of the machine readable record back to the user. Optical
readers, dials and on-line input stations are some of the means to
that end. This self service eliminates the bottleneck of the punching
and the like, and as a rule diminishes the waiting time, since batch
forming may be avoided. This greatly widens the applicability range of
the modern data processing systems. It also will, in the opinion of
the author, make the use of redundant codes more urgent. In terms of
information theory it can be said that a large number of channels
converge into the system. The letters of the alphabet used in each
channel are the words of the code. The alphabets tend to be very large
and the rate at which the letters are generated per channel will be
very low. The channels are not noiseless. The noise is mostly caused
by human factors. Mathematically, the noise is defined by the
transition probabilities p(x,y), where p(x,y) is the chance that the
code word x is received by the system as y. The nature of this noise
will be the subject of section 0.4.


A code is called redundant as soon as the mapping of the coded set
does not cover the code. This redundancy can be more or less
accidental, because the code happens to have more words than necessary
for the set to be coded. In most applications this will be the case,
since populations, customers, inventories etc. do not tend to come in
powers of 10, like the decimal codes do. The redundancy can also occur
intentionally and sometimes temporarily, when the code is purposely
chosen too large for a growing stock or population. Though the control
of this natural redundancy is worthwhile, the main topic of the
following chapters will be that of the artificial redundancy. The
latter form of redundancy is obtained by admitting only a subset of
the code for use. Strictly speaking only the admitted subset itself is
the code. The words outside this subset are sometimes called improper
or forbidden code words. This terminology has the same inconsistency,
not at all unusual in mathematics, which adorns expressions like the
burnt down house.

The code is some sort of intermediary between the real thing and the
denotation. It is therefore typical that the mathematical properties
of the code are sometimes desirable for the sake of the coded objects
and sometimes for the sake of the notation. Hierarchical codes, like
the U.D.C., are examples of the first kind. The teletype code
exemplifies the second kind. It is a 5-dimensional binary code, since
the teletype uses 5 channels. The physical representation, with the 2
states, hole or no hole, is of course a notation.

As will be seen later on in the section on errors, it is advantageous
to introduce a topology or metric in the code, just to be able to
describe the errors which result from the deficiencies of the notation.
These errors provide the criteria for the selection of the subset which
is to form the redundant code. Apart from their use in the struggle
against errors, the codes are of interest as a mathematical object of
study.

The redundancy can be measured as follows. Let U be the set of
potential code words and let C be the selected subset containing the
proper code words. The fraction $1 - |C|/|U|$ is a measure for the
redundancy. The parity check would thus yield a code with a redundancy
of 50%. By taking the base 2 logarithm of $|U|/|C|$, the redundancy is
measured in bits. The parity check has of course a redundancy of 1
bit. If the code words consist of m-ary digits, then the base m
logarithm gives the redundancy in m-ary digits.
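
The measure just defined can be checked numerically. The following is
a minimal sketch in Python, assuming an illustrative word length of 8
bits for the parity check and a hypothetical 4-digit decimal code with
one check digit; the function name `redundancy` and both example codes
are introduced here for illustration only.

```python
import math
from itertools import product


def redundancy(universe_size, code_size, base=2):
    """Return (fraction, digits): the fraction 1 - |C|/|U| and the
    redundancy log_base(|U|/|C|) expressed in base-`base` digits."""
    fraction = 1 - code_size / universe_size
    digits = math.log(universe_size / code_size, base)
    return fraction, digits


# U: all binary words of length 8 (the length is an assumption).
U = list(product((0, 1), repeat=8))
# Parity check: admit only the words with an even number of ones.
C = [w for w in U if sum(w) % 2 == 0]

fraction, bits = redundancy(len(U), len(C))
print(f"parity check: fraction {fraction:.2f}, {bits:.2f} bit")

# Hypothetical decimal code: 4 digits of which one is a check digit,
# so that 10**3 of the 10**4 possible words are proper.
dec_fraction, dec_digits = redundancy(10**4, 10**3, base=10)
print(f"check digit: fraction {dec_fraction:.2f}, {dec_digits:.2f} digit")
```

The fraction depends on the sizes of U and C only, which is why the
same function serves for binary and decimal codes alike.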





As stressed before, the main objective of coding is the increase of
efficiency in handling the coded data. This holds for human processes
as well as for automatic processes. Especially for the latter type
it is important that the codes lend themselves to standardized
notations. This is merely another aspect of the efficiency, but it
shows again that the codes are not made for the peculiarities of the
coded objects alone. In this machine age, people sometimes have to
adapt themselves to the machine. The reason is, maybe, that the
machines do such a tremendous amount of data handling that the pay-off
from the efficiency in the machine part is more important. This may
very well change when the cost per operation goes further down.

The drawback of the increased efficiency is that errors tend to be
more dangerous: an error in a natural name does not always produce
another name, but a number is always changed into another number. One
might also say that the numbers are all alike, or that the names
satisfy certain syntactical or even semantic rules. As not all letter
combinations are used as names, it may be said that the names are
highly redundant. In fact the set of names forms a, perhaps ill
defined, redundant code. As soon as there is redundancy in a code
there is a chance that an erroneous code word does not correspond with
an object. Let A be a set of coded objects and let C be a selected
subset of the set U of code words. Let c be a mapping of A into C,
hence for all $a \in A$ it holds that $c(a) \in C$, and
$c(A) \subseteq C$. If $x = c(a)$ with $a \in A$, and if an error
changes x into y, then there are three possibilities:
i) $y \in c(A)$; ii) $y \notin c(A)$ and $y \in C$; and
iii) $y \notin C$.


In the first case the error is fatal in the sense that a false
mutation may be made. In practice it will often be possible to detect
the error from the context, written or not, like in the case that
Granny was drafted for the commandos. As a matter of fact that is how
one knows about the errors anyhow, for as a rule there is a feed-back
into the system from complaining customers or victims. Sometimes
however the complaints are too late to undo the fatal consequences.
This first case is obviously very undesirable.

The second case is less harmful, since it cannot result in a wrong
mutation. The code word y simply does not correspond to an object. It
may cause some nuisance, since as a rule it will be detected during the
processing. This may be by a mailman looking for a non-existent house.
The automatic detection of these unused code words is under certain
conditions possible. A simple example is a code of which it is known
that only the first n words are used. It is of course necessary that
the code words are ordered, say lexicographically. In the Dutch
population registration number a more sophisticated method has been
applied.

The third case is the most important one from a theoretical point of
view. The error may in that case be detected without any knowledge of
the use of the code. Especially if C is defined by an algorithm, it is
possible to detect the error automatically. It should be noted that
the second class of errors can always be converted into the third class
by the application of a table look-up procedure. The art of making
error detecting codes consists of two things: the first one is to
select the set C in such a way that the most likely errors always
belong to the third class, and the second one is to define such a set
C by means of a simple criterion, which lends itself to an easy
technical implementation. The latter requirement is a matter of
economy, and as soon as the table look-up procedure is feasible, the
requirement loses its importance. In general it is true that, when the
memories get cheaper, algorithms can be (economically) replaced by
table look-up. All these technical considerations are very much
dependent on the state of technology. When the computers get better at
parallel processing, the algorithm might again be more economic.
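
The three possibilities for an erroneous word y can be sketched as
follows. This is only an illustration under stated assumptions: the
criterion `digit_sum_check` (a 4-digit word is proper when its digit
sum is divisible by 10) is a toy rule chosen here and is not one of the
codes of the later chapters, and the set `used` is a hypothetical c(A).

```python
def classify_received(y, used, is_proper):
    """Classify an erroneous word y (assumed to differ from the word sent).

    used      -- the set c(A) of code words actually assigned to objects
    is_proper -- predicate deciding membership of the admitted subset C
    """
    if y in used:
        return "i"    # fatal: y is the name of another object
    if is_proper(y):
        return "ii"   # proper but unused: found only by table look-up
    return "iii"      # improper: detectable from the criterion alone


def digit_sum_check(word):
    """Toy criterion (an assumption): digit sum divisible by 10."""
    return len(word) == 4 and word.isdigit() and sum(map(int, word)) % 10 == 0


used = {"1234", "5005", "1900"}                  # hypothetical c(A)
assert all(digit_sum_check(w) for w in used)     # c(A) is a subset of C

# Three erroneous versions of the code word "1234".
for y in ("5005", "1243", "1235"):
    print(y, classify_received(y, used, digit_sum_check))
```

Note that the transposed word 1243 falls in the second class: the toy
criterion does not notice the interchange of two digits, so only the
look-up in `used` reveals the error.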

There is a tendency nowadays to adapt the machine to the human being,
rather than the other way around. High level programming languages are
also evidence of that tendency. When the computers are learning the
natural language, the coding problems will change, but not disappear.
To be in vogue, the question of optimal error detecting codes should
be considered. It would have to be a code which detects more errors
than any other code with the same redundancy. This property would
clearly be dependent on the frequency of use of the various code words
and on the distribution of the errors. The optimization problem is
not very meaningful, for even if the costs of detected and undetected
errors were known, they are bound to change in the course of time.
Moreover, the distribution of the errors is usually unknown at the
moment that the code has to be chosen. Furthermore, these statistical
qualities may be expected to change during the existence of the
system. Finally, though purely mathematically speaking there is no
problem at all, since there is "only" a finite number of
possibilities, from a practical point of view the problem is probably
just as practical as the differential equations governing the
universe. A situation like this provides for the mathematician a rich
hunting ground for nice problems (see chapters 3 and 4).





The efficiency of a code depends on the frequency with which the code
words occur. Let p denote this frequency distribution. It is well known
from information theory that it is always possible to encode a source
with entropy $-\sum_{x \in C} p(x) \log_2 p(x)$ so that the average
length of the code words comes arbitrarily close to the entropy.
Unfortunately, this theorem is rather sterile in cases where the
coding lies outside the control of the system. It is all right in
simple cases like the following one. Let a code consist of 4 words
with mutation frequencies of 1/2, 1/4, 1/8 and 1/8 respectively. The
entropy is in that case
$\frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{8}\log_2 8 + \frac{1}{8}\log_2 8 = 7/4$.
Encoding the words with 0, 10, 110 and 111 respectively gives exactly
the average of 175 bits per 100 mutations. If, however, the frequencies
are not so civilized, then the proof of the theorem hinges on the trick
of making pairs. The pairs of words form a larger set, with a "more
uniform" statistical distribution of use. E.g., let A, B, C and D be 4
words with a mutation frequency of 60%, 30%, 5% and 5% respectively.

The above mentioned code would score an average of 1.5 bit per
mutation, whereas the entropy is roughly 1.4. Coding the pairs as
follows gives an average of 1.43 bit per word.


AA=0      AC=11011   BC=110101    CC=110100110
AB=100    CA=11110   CB=111010    CD=110100111
BA=101    AD=11111   BD=111011    DC=110100100
BB=1100   DA=11100   DB=1101000   DD=110100101
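
The figures quoted for this example can be recomputed. The sketch below
is an illustration only; it assumes that successive words are generated
independently, so that the frequency of a pair is the product of the
frequencies of its two words, and the helper names are introduced here
for the purpose.

```python
import math
from itertools import product

freq = {"A": 0.60, "B": 0.30, "C": 0.05, "D": 0.05}


def entropy(p):
    """Entropy in bits of the frequency distribution p."""
    return -sum(q * math.log2(q) for q in p.values())


def average_length(p, code):
    """Average number of bits per coded item."""
    return sum(p[w] * len(code[w]) for w in p)


def is_prefix_free(code):
    """True if all code words differ and none is a prefix of another."""
    words = list(code.values())
    if len(set(words)) != len(words):
        return False
    return not any(u != v and v.startswith(u) for u in words for v in words)


single = {"A": "0", "B": "10", "C": "110", "D": "111"}
pairs = {
    "AA": "0", "AB": "100", "BA": "101", "BB": "1100",
    "AC": "11011", "CA": "11110", "AD": "11111", "DA": "11100",
    "BC": "110101", "CB": "111010", "BD": "111011", "DB": "1101000",
    "CC": "110100110", "CD": "110100111",
    "DC": "110100100", "DD": "110100101",
}
# Assumption: independent words, so pair frequencies multiply.
pair_freq = {x + y: freq[x] * freq[y] for x, y in product(freq, repeat=2)}

print(f"entropy           {entropy(freq):.4f} bit per word")
print(f"single-word code  {average_length(freq, single):.4f} bit per word")
print(f"pair code         {average_length(pair_freq, pairs) / 2:.4f} bit per word")
print("pair code separable:", is_prefix_free(pairs))
```

The pair code stays separable in a continuous bit stream because no
code word is the beginning of another one.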


Now in data processing systems in which the code words are generated 
independently at various locations, before they are channeled into the 
system, only the first approximation seems to be possible. This is not 
quite true, since one could do the pairing at the various sources, but 
it would require a code book of all the pairs. Now just imagine a bank 
publishing the book of the code numbers of all the pairs of account 
holders, not to speak of the letter headings of the customers.

The codes mentioned above are intended for use in a binary channel,
where all the words are linked together and thus have to be separable
afterwards. If there is a natural separation between the different
words, it is possible to obtain a bigger gain in efficiency. Consider
the second example again. Now the first approximation may be taken as
A → 0; B → 1; C → 00; D → 01. It would give an average of 1.1 bit per
mutation. The second approximation, given below, would require only
0.8 bit as an average.


AA → 0    AC → 10    BC → 010   CC → 110
AB → 1    CA → 11    CB → 011   CD → 111
BA → 00   AD → 000   BD → 100   DC → 0000
BB → 01   DA → 001   DB → 101   DD → 0001


It should be noted that even in the case of a uniform distribution, this
type of coding would give a gain in efficiency.
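
A short sketch of this second type of coding, where the words are
separated naturally and need not be prefix free: the shortest binary
strings are handed out in order of decreasing frequency. The function
names are illustrative, pairs are again assumed to be independent, and
ties between equally frequent pairs may be resolved differently from
the table above without changing the average.

```python
from itertools import count, product


def shortest_words():
    """Yield 0, 1, 00, 01, 10, 11, 000, ...: all non-empty binary strings."""
    for n in count(1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def separated_code(p):
    """Give the shortest strings to the most frequent items."""
    ranked = sorted(p, key=p.get, reverse=True)
    return dict(zip(ranked, shortest_words()))


def average_length(p, code):
    return sum(p[w] * len(code[w]) for w in p)


def per_word_averages(freq):
    """Average bits per word for single words and for pairs of words."""
    pair_freq = {x + y: freq[x] * freq[y] for x, y in product(freq, repeat=2)}
    first = average_length(freq, separated_code(freq))
    second = average_length(pair_freq, separated_code(pair_freq)) / 2
    return first, second


skew = {"A": 0.60, "B": 0.30, "C": 0.05, "D": 0.05}
uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

print("skew    :", per_word_averages(skew))
print("uniform :", per_word_averages(uniform))
```

For the uniform distribution the pairs need 42/16 bits per pair, i.e.
about 1.31 bit per word against 1.5 for the single words, which is the
gain referred to in the text.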

All these considerations are probably of little practical value, since
the distribution of use will only become available after the code has
been used for some time. The cost of recoding will mostly outweigh the
possible gains. In practice the distribution can be extremely skew,
like in banking operations where often less than 1% of the accounts
draw more than 50% of the mutations. The short code numbers are however
often more correlated with the old clients than with the mutation
getters. In population register systems the distribution will probably
be closer to uniform.


The double distribution f(x,y) contains a host of information, which is
usually not available. This is certainly so during the design stage of
the system. Later on, when the system is operating, that information
will become available. It is then still useful in the struggle against
the errors. Suppose that in the administrative system of a bank a
certain error probability f(a,b) gets high; then this may be an
indication of a systematic error, which possibly lies outside the
system. It might be a misprint in somebody's account number, as
indicated on his bills. These kinds of errors are only of local
interest. In general however one will be interested in deducing
principles, like the law that most errors are in one digit only. The
knowledge of the double distribution is therefore more qualitative
than quantitative. But the vaguer the knowledge, the broader the
applicability.

The error samples, as found in existing systems, will be biased if the
distribution of the mutations is not uniform, i.e., virtually always.
The following error types have been observed both in the literature on
the subject (see 2, 26, 28) and in samples made available to the author
by the courtesy of the Dutch Postal Clearing House and the Clearing
House of the Amsterdam Municipality.

1) The single errors, also called transcription errors. These errors
affect only one digit of the code word. It is by far the largest class
of errors in all known cases. Its frequency ranges from 60 to 95%.
Little is known about the distribution within this class. Both clearing
house samples suggest that the right hand side of the code word is more
vulnerable to errors. This might be caused by the fact that the numbers
are written and punched from left to right, so that the rightmost
digits have to be kept longer in the short term memory of the writing
or punching being. The number systems in which this was observed are
of the non-fixed length type. This implies that the last position is
never void, contrary to the other positions. The transition
probabilities of the decimals do indeed depend on the decimals, but
there are few very low ones. (See the tables of the next section.)

W. Ulrich (49) introduced the concept of the restricted single error,
which is a single error with the restriction that the difference between
the correct and the incorrect digit is one unit. This concept gives rise
to elegant generalizations of known binary codes. It is conceivable that
certain technical implementations of calculators, like the ones using
pulse trains to represent the digits, lead to this error type. But
there is no evidence in the error statistics that this type is of
special interest.


2) The double errors. These errors affect two digits. The frequency
ranges from 10 to 20%. The vast majority concerns adjacent digits, i.e.,
digits with adjacent positions. This is of course an indication that
the two errors are not independent. The double errors are subdivided
into:





2.1) The interchange of two adjacent digits, i.e., an error of the
form ab → ba. This error type is called a transposition. It will always
be understood that the digits are adjacent if in the following chapters
the term transposition is used.

The transpositions are a notorious error type of a typically human
nature. The qualities of an error-detecting code are often judged
according to its detecting capacity in this very class of errors,
assuming of course that the single errors are detected anyhow.
Mathematically, it turned out that the decimal codes were especially
difficult in this respect. Some authors thought that they had proved
that decimal codes detecting all single errors as well as all
transpositions were non-existent (see 46, 39). Fortunately, these
"proofs" came after the present author had constructed such a code
(see 51).

There are several minor classes of double errors, which are important
since codes detecting all single errors and all transpositions may be
completely insensitive to these classes. Their frequency is small, say
0.5 to 1.5% of all errors. These classes are:





2.2) The twin errors. These errors are of the form aa → bb. They
can easily be explained, for in case one "a" is misread as a "b", the
other one is likely to be misread too. Also, if one is punching
blindly, it is logical that, if a finger is on the wrong key for the
first a, then the second one will be treated, or rather mistreated, in
the same way.





2.3) The jump transpositions. These errors consist of the interchange
of 2 digits, jumping over a third one, like abc → cba. It could also be
called a reversion, since the order of the 3 digits gets reversed. Its
psychological explanation can perhaps be sought in an auditive echo.





2.4) The phonetic errors. The samples reveal that among the adjacent
double errors, the errors of the type ab → ca or vice versa occur much
more often than chance predicts. Among these the errors with b=0 and
c=1 are again much more numerous than expected. These errors are called
phonetic. They might be explained by the phonetic resemblance when the
pairs a0 and 1a are pronounced. This is of course dependent on the
language, but it holds in English, Dutch and German. This explanation
is strengthened by the fact that the errors 12 → 20 and vice versa are
indeed much less frequent. It would be interesting to know how this is
in the French speaking countries. It is also an open question whether
an oral communication link is needed, or whether punch typists with an
auditive memory can be responsible for this error type.

2.5) The jump twin errors. These errors are of the form aba → cbc. Their
frequency is, as is to be expected, lower than that of the twin errors,
say 50% of it. Their explanation may be the same. The frequency of more
remote twin errors, like a..a → c..c, is very much lower. This is also
the case with the interchange of digits over more than 1 digit.

3) The omission or addition of digits. These errors change the length
of the code word. The frequency lies somewhere between 10 and 20%. The
vast majority consists of the omission of one digit, where the last
position again seems to be the most vulnerable one. It is also striking
that the 0 is the decimal which is most easily dropped. However, since
there seems to be a tendency to allocate "beautiful" numbers, ending
with one or more zeros, to important customers, like tax collectors,
there definitely is a bias if the statistics are drawn from banking and
clearing systems. It is also remarkable, and equally dubious, that the
forgotten digits often were members of a sequence of identical ones. It
is an illustration of the ancient theorem that beauty is dangerous.





since it consists of those errors for which there is apparently no
relation between the correct number and the erroneous one. It is also
believed that all numbers are equally susceptible to this error type.
The random errors are both pleasant and nasty. They are pleasant, because
all redundancy helps to detect them. They are nasty, since it is
impossible to design a code which would do any better than any other
code. There is also a very important assumption made, hidden in the word
"apparently". For there may very well be no relation between the code
numbers as such, but there can be a hidden semantic relation. E.g., both
numbers can belong to the same person, one being his account number and
the other his telephone number. It can also happen that the two numbers
are adjacent in some code book. It is difficult to trace that kind of
error without employing a full-time detective. All this would not
be so serious as long as these "semantic" errors behaved like random
errors, but there is every reason to believe that this sort of error will
prove to be immune to all detection systems. If somebody is copying the
wrong number correctly, he will do so too if the number is one of an
error detecting code. These immune errors may turn out to be one of the
criteria for how far one has to go in the improvement of the detection
capacity of a code. Suppose that a certain system has to cope with
100 errors a day, 50 of which are immune. Now one might be interested
in cutting this down to 55 undetected errors, the same 50 immune ones
included, at the cost of one more check digit. However, cutting this down
to 50.5 at double cost might be unattractive. The immune errors form some
kind of basic noise level.
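
The arithmetic of this example can be sketched in a few lines. The sketch
assumes, purely for illustration, that every added decimal check digit lets
one in ten of the non-immune errors slip through; the function name and that
detection rate are assumptions of the sketch, not measurements.

```python
def undetected_errors(total, immune, check_digits, base=10):
    """Expected number of undetected errors per day.

    Assumption (illustrative only): each check digit in the given base
    leaves 1/base of the non-immune errors undetected, while the immune
    errors are never detected.
    """
    detectable = total - immune
    return immune + detectable / base ** check_digits


# The immune errors act as a floor that no amount of redundancy removes.
for digits in range(3):
    print(digits, undetected_errors(100, 50, digits))
```

Under these assumptions the second check digit buys far less than the first,
which is the point of the noise-level remark.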

The total frequency of the random errors varies considerably, depending
on the nature of the system. If the code numbers are used more or less
publicly, this class might be much bigger than for those systems where
the numbers are used more privately, like passport numbers. For the
clearing house systems 5 to 15% has been measured. The percentage of the
immune errors, though more important, is unknown.

5) Finally, there remains the traditional class called miscellaneous. It
contains collector's items like aba -> bab, abcd -> cdab and aaaa -> bbbb.
All are rare and mostly difficult to detect completely. Occasionally some
defy detection, so that in studies of the undetected errors of a certain
code these rare errors might seem significant.

The main sample of errors available to the author concerns errors made
in a non-fixed length code. The total size of the sample is 22733 pairs
(each consisting of a good code word and an erroneous one). The sample is
divided according to the length of both numbers, as follows:

Both numbers 7 digits         8
Both numbers 6 digits     12112
Both numbers 5 digits      3333
Both numbers 4 digits      1774
Both numbers 3 digits       139
Both numbers 2 digits        25
Unequal length             5342

There were 2343 cases of one forgotten digit.
The analysis of the largest group will be given here as an illustration.
The distribution according to the number of places on which the words
of each pair differ is:

1 place      9574 or 78.9%
2 places     1870 or 15.6%
3 places      169 or  1.4%
4 places      118 or  1.0%
5 places      219 or  1.8%
6 places      162 or  1.3%

            12112


A further analysis of the single errors reveals that the rightmost
digit is affected most frequently. The distribution according to the
position of the error, counting from the right, is:

position 1 : 2854
position 2 : 2296
position 3 : 1270
position 4 :  929
position 5 : 1503
position 6 :  122

             9574

The following matrix gives the transition frequencies of the ten decimals.
The 125 in row 4 and column 6 means that 125 times a "4" became a "6".

[Transition matrix not recoverable; the surviving fragment reads:
749, 960, 1308, 811, 1075, 1018.]


It would be highly interesting to know which properties of this matrix
are independent of the system from which the errors are drawn.

The restricted single errors total 2923, which is higher than the expected
2/9th of 9574. The digit "3" seems to be the black sheep of the decimals.

The double errors have also been subjected to a further analysis. From
a technical (and probably also from a psychological) point of view it
is interesting to know whether the double errors tend to come in bursts.
The following distribution according to the distance of the errors in
the words has been found.

Distance 1 (adjacent positions)    1595
Distance 2 (x.x, jump errors)       177
Distance 3 (x..x)                    71
Distance 4 (x...x)                   18
Distance 5 (x....x)                   4

                                   1870

This statistic strongly suggests that the errors are dependent. The 1595
burst errors are subdivided into:

Transpositions     1237
Twin errors          67
Phonetic errors      59
Rest                232

                   1595

The 177 jump errors are divided into:

Jump transpositions   99
Jump twin errors      35
Rest                  43

                     177


The distribution of the phonetic errors according to their position in
the code word is as follows:

[table not recoverable]

The absence of the phonetic errors on the odd positions may be explained by
the habit of quoting the words in pairs of decimals. The distribution
of the errors 1x -> x0 and x0 -> 1x over x is:

[table not recoverable]

It is typical that 8 has such a low frequency, because in the Dutch
language 80 is "tachtig" but 18 is "achttien", in contradistinction
with English and German, which are consistent with "eighty" and
"eighteen" and "achtzig" and "achtzehn" respectively.

The multiple errors are mostly errors of the random type and as such 
they defy analysis. 


0.7. Detection versus prevention. 





No matter how good the error detecting capacity of a check system is,
one will still be interested in minimizing the number of errors. The
available measures, which belong mainly to the realm of human engineering,
fall outside the scope of this monograph. There is however also a
mathematical approach to the problem of error prevention. This approach is
based on the non-uniformity of the distribution of the errors over the
code words. By selecting a code C in such a way that the overall error
chance is minimal, a certain error prevention is achieved. The more
error prone code words are excluded. In the Dutch population register
system those code words having equal decimal digits on adjacent
positions are avoided, since these code words are considered to be more
error prone than the other ones.

The available statistics are however still insufficient to tackle this
problem effectively.

Another virtually unknown factor is the increase of the error chance
because of the added check digit. It is obvious that a high redundancy
may very well successfully lower the percentage of the undetected
errors, but it will also lower the percentage of correct code words.
The detection becomes, if the redundancy increases, in a certain sense
less effective. The reason is that the longer the code word is, the
less information the detection of an error provides. Thus it will be a
small surprise to learn that a certain book contains an error.

The ultimate goal of detection is of course a correction. This can
often only be done by feedback towards the source of the error. In
systems with a decentralized input and a parallel processing, it is a
customary procedure to reject the erroneous inputs, so that the rest
can be processed. If this rest is not the bulk of the workload, or if
the system is processing serially, it becomes desirable to have an
on-line correction. The problem arises to construct codes with the
property that such a correction, which can never be infallible, is
at least most likely.


0.8. 





An error correcting code is a redundant code C, along with a decision
scheme which associates with certain improper code words a proper one,
which is called the corrected code word. This association can in principle
be done quite arbitrarily, but it is natural to do it in such a
way that each code word is embedded in a set of words which can be
obtained by making an error, of a certain type, in said code word. If
the code is such that these sets are mutually disjoint, then an error
of that type can be corrected by the convention that if a word of such
a set is received then the only proper code word of that set is taken
as the corrected word. One could also say that in such a case the
coding is not unique, since to each object a whole set of code words is
allocated. The code, though no longer unique, has still to be unambiguous
and this is so as soon as the associated sets are mutually
disjoint. Hence to each $x \in C$ there belongs a set $A(x)$, with the
property that from $x \ne y$ it follows that $A(x) \cap A(y) = \emptyset$.
Let V be the union of all $A(x)$, thus $V = \bigcup_{x \in C} A(x)$. If
$V = U$ then the code is said to be perfect or close packed. In V there is
an equivalence defined by the classes $A(x)$ and each word of V is
equivalent with just one word of C. This defines a mapping $\varphi$ of V
on C. The correction procedure corrects each word w of V into
$\varphi(w)$. If a word outside V is received then the error is detected,
but cannot be corrected. This cannot occur if $V = U$, i.e. if the code is
close packed. The term perfect is less appropriate, since it is in a way
not the code which is perfect but the correcting scheme, because it
corrects every error. This property may be desirable for the applications
in the serial processes, but not for the systems with parallel processing
where the correction is only needed to secure that the bulk of the input
can be processed. In order to appreciate this point it should be noted
that an error correcting code only guarantees the correction of a certain
type of errors. In real life however also errors of other types are bound
to occur. A perfect correcting scheme will "correct" these errors by
introducing an error of the protected type. It may therefore be a good
policy to choose V deliberately so small that certain errors will never be
"corrected". The code of the Dutch population register system is a single
error correcting code which does not "correct" the transpositions.

The type of random errors is the stumbling block, since a random
error is never guaranteed to fall outside V. In fact a random error
correcting code C has only one code word, since $A(x) = U$ for all x. This
trivial code is always perfect.
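
The decision scheme described above can be sketched as follows. The sketch
takes for $A(x)$ the code word x together with all words at one single error
from it; that choice, the binary repetition code and all function names are
assumptions made only for the illustration.

```python
from itertools import product


def ball(word, alphabet):
    """The word itself plus every word differing from it in one position."""
    words = {word}
    for pos, symbol in product(range(len(word)), alphabet):
        words.add(word[:pos] + (symbol,) + word[pos + 1:])
    return words


def decision_scheme(code, alphabet):
    """Map every word of V to its code word; fail if the sets A(x) overlap."""
    phi = {}
    for x in code:
        for w in ball(x, alphabet):
            if w in phi:
                raise ValueError("sets A(x) are not mutually disjoint")
            phi[w] = x
    return phi


alphabet = (0, 1)
code = [(0, 0, 0), (1, 1, 1)]
phi = decision_scheme(code, alphabet)
U = set(product(alphabet, repeat=3))
print(phi[(0, 1, 0)])   # corrected word for the received word 010
print(set(phi) == U)    # True when V = U, i.e. the code is close packed
```

A word that is not a key of `phi` lies outside V: the error is then detected
but not corrected.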





Let C again be a redundant code in a space U. It is often possible to
find one or more codes C' with the same detecting capabilities as C, but
disjoint with C. As will be seen later on, decimal codes defined by a
check equation will split up the space U into 10 mutually disjoint
codes (or in general k, if one is working modulo k). In section 0.7
it was pointed out that, though these codes are equivalent detection-wise,
they may be different as to the overall error chance. There is
another way in which these disjoint codes may be useful for the
applications. Suppose that two operating systems, perhaps sharing many
customers, need an error protection for their codes. By adding a check
digit to the existing code words much of the cost of recoding can be
avoided. If these systems draw the check digit from disjoint codes they
have the additional advantage that each valid number of one system
is invalid for the other system. This might eliminate a source of
seemingly random errors.

Another application might be a group of branch offices of a large bank
with a central administration. If disjoint codes are used for the
clients of the various branch offices, then one would have a protected
code without using more digits. The traditional solution would use the
first digit to designate the branch office without giving any error
detection possibility within the local administration. It is an elegant
way of setting the redundancy to work. In cases with more than 10
subsystems a higher modulus check might be useful (see section 2.3).


0.10. Better error detection by _ 





In section 0.3 it was argued that the natural redundancy, which is
usually present since the codes are seldom used to full capacity, may
lead to error detection during the processing. It would be an advantage
if this detection could be done during the input stage. This can
be achieved by a controlled use of the code. It is not uncommon to use
only the first interval, under lexicographical ordering, of the code.
Suppose that some system with 600000 customers uses a code with 6-digit
decimal code words. Using only the first 600000 numbers guarantees that
an error which yields a higher number is detected at the input if the
proper measures are taken. The protection procured in this way is however
primarily aimed at the first (least vulnerable) decimal. Much
better in this respect is the pseudo random use of the code, which can
be accomplished in the following manner. With the aid of a reversible
deciphering one can shuffle the code words and by using the first
(in the example above, 600000) numbers, the used part of the code is
(pseudo) randomly distributed over the code. Now a code word received
at the system's input can be reshuffled and if it does not belong to the
first 600000 an error is detected. In the code of the Dutch population
register system this feature is incorporated. The reversible deciphering
is done with a feedback shift register, working in the field of the
complex ternary numbers, i.e. the complex numbers of the form $a + bi$ with
$a, b \in \{0, 1, 2\}$.
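
The pseudo random use of a code can be sketched as below. As a stand-in for
the reversible shuffle, the sketch multiplies by a constant modulo $10^6$;
this is an assumption chosen for brevity and is not the feedback shift
register of the Dutch system. The multiplier value and the function names are
hypothetical.

```python
MODULUS = 10 ** 6        # all 6-digit decimal code words
USED = 600_000           # number of customers in the example
MULTIPLIER = 387_421     # hypothetical; any value coprime with MODULUS works
INVERSE = pow(MULTIPLIER, -1, MODULUS)   # modular inverse, Python 3.8+


def issue(serial):
    """Shuffle a serial number 0..USED-1 into the full code space."""
    if not 0 <= serial < USED:
        raise ValueError("serial outside the used interval")
    return (serial * MULTIPLIER) % MODULUS


def check(code_word):
    """Reshuffle a received word; accept it only if it maps back below USED."""
    return (code_word * INVERSE) % MODULUS < USED


number = issue(123_456)
print(number, check(number))
```

Because the shuffle is a bijection, exactly 600000 of the million words are
accepted, but they are now spread over the whole code instead of forming one
interval.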
In this chapter some concepts are introduced to facilitate the
discussion of redundant codes.


An error type is essentially a mapping of a set U of (potential) code
words into the class of its subsets. To each $a \in U$ there corresponds
a set $E(a) \subseteq U$ of all those words which can be derived from a by
an error of the given type. The set $E(a)$ may be empty. In the case of
the random errors each $a \in U$ is mapped on the set $U - \{a\}$. A code C
is called E-proof if $E(a) \cap C = \emptyset$ for all $a \in C$. An E-proof
code C admits a correcting scheme for the error-type E if
$E(a) \cap E(b) = \emptyset$ for all $a, b \in C$ with $a \ne b$. It is then
called an error correcting code. If moreover
$\bigcup_{a \in C} E(a) = U - C$ then the correcting code is called perfect
or close-packed. An E-proof code C is called maximal if there does not
exist an E-proof code C' which properly contains C. If such is the case
it follows that $E(b) \cap C \ne \emptyset$ for each $b \notin C$.
Schauffler (43) calls such a code closed (abgeschlossen) with respect to E.
An E-proof code C is called optimal if there does not exist an E-proof
code in U with more words than C. An optimal code is necessarily
maximal. A code C is said to be p% E-proof if
$$ p/100 = \sum_{a \in C} |E(a) - C| \Big/ \sum_{a \in C} |E(a)|. $$
The redundancy of a code C in an m-ary space U is defined as
$\lg_m(|U|/|C|)$ digits or $\lg_2(|U|/|C|)$ bits.
An error-type E is called symmetric if from $a \in E(b)$ it follows that
$b \in E(a)$. Most of the error-types mentioned in the introduction are
symmetric. The type of the forgotten digits is an exception.
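
These definitions translate almost literally into code. The following sketch
does so for the single errors on short binary words; the helper names and the
two small example codes are assumptions made for the illustration.

```python
from itertools import product
from math import log


def single_errors(word, m):
    """E(a) for single errors: all words differing from a in exactly one place."""
    return {word[:i] + (s,) + word[i + 1:]
            for i in range(len(word)) for s in range(m) if s != word[i]}


def is_e_proof(code, error_set):
    """E(a) and C are disjoint for every a in C."""
    return all(not (error_set(a) & code) for a in code)


def admits_correction(code, error_set):
    """E-proof, and E(a), E(b) disjoint for all distinct a, b in C."""
    words = list(code)
    return is_e_proof(code, error_set) and all(
        not (error_set(a) & error_set(b))
        for i, a in enumerate(words) for b in words[i + 1:])


def redundancy(code, m, n):
    """lg_m(|U| / |C|) digits for a code in the space of n-digit m-ary words."""
    return log(m ** n / len(code), m)


m, n = 2, 3
E = lambda a: single_errors(a, m)
parity = {w for w in product(range(m), repeat=n) if sum(w) % 2 == 0}
repetition = {(0, 0, 0), (1, 1, 1)}
print(is_e_proof(parity, E), admits_correction(parity, E))
print(admits_correction(repetition, E), redundancy(repetition, m, n))
```

The even-parity code detects single errors but cannot correct them, whereas
the repetition code pays a second bit of redundancy for the correction.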
For symmetric error-types a metric can be defined. The distance between
a and b is the minimal number of errors (of the given type) which have
to be made in a, in order to get b. It is called the E-distance and
denoted by $d_E(a,b)$. The subscript E will often be dropped. More formally:
$d(a,a) = 0$ and $d(a,b) = k$ if there exists a chain
$a_0, a_1, \ldots, a_k$ with $a_0 = a$ and $a_k = b$ such that
$a_{i+1} \in E(a_i)$, $k > i \ge 0$, and if there does not exist
a shorter chain with that property. If no chain exists at all the
distance is per definition infinite.

The E-distance is properly called a distance since $d(a,b) \ge 0$ and

i) the reflexive law $d(a,a) = 0$;
ii) the symmetric law $d(a,b) = d(b,a)$ and
iii) the triangle inequality $d(a,b) + d(b,c) \ge d(a,c)$

are fulfilled.

i) is obvious, ii) follows directly from the symmetry of the error-type
E and iii) follows from the fact that the concatenation of the
chains from a to b and from b to c forms a, not necessarily minimal,
chain from a to c. The definition of distance is a straightforward
generalization of the Hamming distance for the single bit errors in
binary codes.
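
Since the E-distance is the length of a shortest chain, it can be computed by
a breadth-first search over the words. The sketch below is one way to do
this; the function names and the two example error types are assumptions of
the sketch.

```python
from collections import deque
from math import inf


def e_distance(a, b, error_set):
    """Length of the shortest chain a = a_0, ..., a_k = b with a_{i+1} in E(a_i)."""
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        word, steps = queue.popleft()
        if word == b:
            return steps
        for nxt in error_set(word):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return inf   # no chain exists: b lies in another connected component


def single_bit_errors(word):
    flip = {"0": "1", "1": "0"}
    return {word[:i] + flip[c] + word[i + 1:] for i, c in enumerate(word)}


def adjacent_transpositions(word):
    return {word[:i] + word[i + 1] + word[i] + word[i + 2:]
            for i in range(len(word) - 1) if word[i] != word[i + 1]}


print(e_distance("0000", "0110", single_bit_errors))
print(e_distance("123", "321", adjacent_transpositions))
print(e_distance("12", "13", adjacent_transpositions))
```

For the single bit errors the result coincides with the Hamming distance; for
the transpositions two words with different digits are infinitely far apart.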

If all distances are finite the space U is called connected with
respect to E. Otherwise U falls apart into connected components. The
diameter of a connected space U is $\max_{a,b \in U} d(a,b)$. For the random
errors the diameter of every space is 1. For the single errors the
diameter is equal to the dimension of U.
The greatest possible diameter is $|U| - 1$, since that is the length of
the longest chain in U. The following examples show that this diameter
is possible. Suppose that the code words of U are listed somehow in
a codebook. Let the type of error be that of taking the list item
directly preceding or following the correct one (restricted lookup
errors). Another example is that U consists of a set of consecutive
integers with respect to the errors of one unit in the arithmetical
sense. The E-distance of two different words a and b of an E-proof
code C is at least 2. If it were less, then $b \in E(a)$, but for an
E-proof code $E(a) \cap C = \emptyset$ holds. If C admits a correcting
scheme the E-distance between any two words is at least 3, for otherwise
there would exist a word c such that $d(a,c) = d(c,b) = 1$ and hence
$c \in E(a) \cap E(b)$.

A code C is said to have a minimum distance k when
$\min_{a,b \in C,\ a \ne b} d(a,b) = k$.
In view of the definition of distance it will be clear that a code
with minimum distance $2e+1$ will admit a correcting scheme for e errors
of the type E. It also will detect 2e or fewer errors.

An E-ball of radius k and center a is the set of all words x satisfying
$d(a,x) \le k$. It is denoted by $S_E(a,k)$. The difference between a ball
with radius 1 and the error set $E(a)$ is clearly that the latter does
not contain a; hence $S(a,1) = E(a) \cup \{a\}$.

Let $E^i(a)$ be the set of those words obtainable from a by making i
mistakes of the type E, but which cannot be obtained by making fewer than
i errors. Thus $E^0(a) = \{a\}$, $E^1(a) = E(a)$ and
$E^i(a) = S(a,i) - S(a,i-1)$ for $i > 0$.
Conversely, $S(a,e) = \bigcup_{i=0}^{e} E^i(a)$ for $e \ge 0$. From the
definition it follows immediately that $E^i(a) \cap E^j(a) = \emptyset$ for
$i \ne j$ and therefore
$$ |S(a,e)| = \sum_{i=0}^{e} |E^i(a)|. $$
An error-type is called uniform if $|E(a)| = |E(b)|$ for all $a, b \in U$.
Single errors are of the uniform type, whereas the transpositions are
non-uniform ($E(13) = \{31\}$ and $E(22) = \emptyset$). An error-type is
strongly uniform if $|E^i(a)| = |E^i(b)|$ for all $i > 0$ and all
$a, b \in U$.


1.1 Some upper bounds for minimum distance codes.





From the definition of an error correcting code it follows that
$S(a,1) \cap S(b,1) = \emptyset$ for all $a, b \in C$ with $a \ne b$. An
immediate consequence is the relation:
$$ |U| \ge \sum_{a \in C} |S(a,1)|. $$
Therefore:

Theorem 1.1.0 The redundancy of a minimum distance 3 code is at least
$\lg_m \bigl( \sum_{a \in C} |S(a,1)| / |C| \bigr)$ digits.

This bound is a generalization of the sphere packing bound, as it is
known in the literature on the binary codes with respect to the single
bit errors. For uniform error-types the bound is simplified into
$\lg_m |S(a,1)|$.
The obvious generalization is the

Theorem 1.1.1 The redundancy of a minimum distance $2e+1$ code is at
least $\lg_m \bigl( \sum_{a \in C} |S(a,e)| / |C| \bigr)$ digits.

Proof: Let a and b be two words of a minimum distance $2e+1$ code C, then
$S(a,e) \cap S(b,e) = \emptyset$, for otherwise there would exist a word
$c \in U$ with the property $d(a,c) \le e$ and $d(b,c) \le e$. From the
triangle inequality it then would follow that $d(a,b) \le 2e$, which
contradicts the minimum distance property of C. The disjointness of the
spheres and the relation $\bigcup_{a \in C} S(a,e) \subseteq U$ give
$\sum_{a \in C} |S(a,e)| \le |U|$. After division by $|C|$
and by taking the m-logarithm of both sides of this relation the
theorem is found. These theorems are especially helpful for proving
the nonexistence of certain codes.

The Hamming codes are examples of perfect minimum distance 3 binary
codes. These codes enable the correction of single errors. Perfectness
of codes is a mathematical nicety, which has from a practical
point of view the disadvantage that all other errors are "corrected"
by introducing another error. The point is of course that the non-perfect
codes have a higher redundancy. Perfect binary codes for correcting
more than 1 single error are collector's items (45).

Finding an optimal error correcting code is a matter of packing as many
balls $S(a,e)$ as possible in the space U. For a strongly uniform
error-type a close-packed code is necessarily optimal. For a non-uniform
error-type it is conceivable that a perfect code is not optimal since
the latter might have many small balls whereas the perfect one possibly
covers U with a few large balls. For the even minimum distance codes
it is not simply a matter of packing balls since these may now overlap
each other. The question is how this overlapping can be done effectively.
Consider two points a and b of a minimum distance $2e$ code C in the
space U. Suppose that $d(a,b) = 2e$, then
$S(a,e-1) \cap S(b,e-1) = \emptyset$ and $E^e(a) \cap E^e(b) \ne \emptyset$.
The space U is split up into 3 types of points, i.e.

i) The points of the balls with center in C and radius $e-1$,
ii) The points contained in the sets $E^e(a)$ with $a \in C$,
iii) The other points.

Denote these mutually exclusive sets by $U_1$, $U_2$ and $U_3$ respectively.
Then $U_1 = \bigcup_{a \in C} S(a,e-1)$ and $U_2 = \bigcup_{a \in C} E^e(a)$.
The following relations hold:
$$ |U_1| = \sum_{a \in C} |S(a,e-1)|, \quad
   |U_2| \le \sum_{a \in C} |E^e(a)| \quad \text{and} \quad |U_3| \ge 0. $$

Now define $c(a,e)$ as the maximal number of points $a_i$ such that
$d(a,a_i) = e$ and $d(a_i,a_j) \ge 2e$ for $c(a,e) \ge i > j > 0$. It will
be called the covering index of a. Since each $x \in U_2$ can be in at most
$c(x,e)$ sets $E^e(a)$, the relation
$\sum_{x \in U_2} c(x,e) \ge \sum_{a \in C} |E^e(a)|$ holds. Let
$c_e = \max_{x \in U} c(x,e)$, then
$$ |U_2| \cdot c_e \ge \sum_{x \in U_2} c(x,e) \ge \sum_{a \in C} |E^e(a)|. $$
Consequently
$|U - U_1| \cdot c_e = |U_2 \cup U_3| \cdot c_e \ge |U_2| \cdot c_e
\ge \sum_{a \in C} |E^e(a)|$. Combining this relation with
$|U - U_1| = |U| - \sum_{a \in C} |S(a,e-1)|$ gives
$$ |U|/|C| \ge \sum_{a \in C} \{ |S(a,e-1)| + |E^e(a)|/c_e \} / |C|. $$
In this way a lower bound for the redundancy has been found. For
strongly uniform error-types this bound is simplified into
$|S(a,e-1)| + |E^e(a)|/c_e$, where a is chosen arbitrarily in C.

The result is formulated as:

Theorem 1.1.2 The redundancy of a minimum distance $2e$ code is at least
$\lg_m \{ \sum_{a \in C} ( |S(a,e-1)| + |E^e(a)|/c_e ) / |C| \}$ digits.
For strongly uniform error-types this bound is simplified into
$\lg_m ( |S(a,e-1)| + |E^e(a)|/c_e )$, with $a \in C$.





Let the type of the single errors be denoted by $E_1$. Let n be the
dimension of the space U of m-ary words. The set $E_1(a)$ consists of
all words which differ from a on only one position. There are $m-1$
possibilities per position and thus $|E_1(a)| = n(m-1)$ and
$|E_1^i(a)| = \binom{n}{i}(m-1)^i$.
Two words a and b of an $E_1$-proof code differ therefore on at least
two places and that is why such a code is sometimes called bidifferent.

Theorem 1.2.0 The redundancy of a bidifferent code in an m-ary space
U is at least 1 digit.

Proof: Suppose $\lg_m(|U|/|C|) < 1$, then $|C| > |U|/m = m^{n-1}$, where n
is the dimension of U. Since there are $m^{n-1}$ different words with
$n-1$ positions, it follows that C contains at least two words, say a and
b, which are identical on the first $n-1$ positions. Hence $b \in E_1(a)$
and consequently C is not $E_1$-proof.

Theorem 1.2.1 There do exist bidifferent m-ary codes with a redundancy
of 1 digit.

Proof: Let U be the space of all m-ary words with n positions. One may
assume that the symbols of the words stand for the residue classes
modulo m. If this were not the case one can first make a 1-1 correspondence
between the symbols and these residue classes. Now let
$a = a_1 a_2 \ldots a_n$ be a word of U and consider the sum
$s = \sum_{i=1}^{n} a_i$ modulo m. There are m values possible
for s and thus the words of U are partitioned into m classes according to
that value. These classes all have the same number of elements. This
is so since the m different words which are equal on, say, the first
$n-1$ positions clearly are in different classes. From this it follows
that each of these classes is a code with 1 digit redundancy. Moreover,
since words differing from each other on only one place cannot have the
same digit sum modulo m, each one of these codes is bidifferent. In view
of the preceding theorem they are also optimal.
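
The construction in this proof is easy to check by brute force for small
parameters. The sketch below does so for 3-digit decimal words; the function
names are assumptions of the sketch.

```python
from itertools import product


def digit_sum_classes(m, n):
    """Partition all n-digit m-ary words by their digit sum modulo m."""
    classes = {s: [] for s in range(m)}
    for word in product(range(m), repeat=n):
        classes[sum(word) % m].append(word)
    return classes


def is_bidifferent(code):
    """Every two distinct words differ on at least two places."""
    return all(sum(x != y for x, y in zip(a, b)) >= 2
               for i, a in enumerate(code) for b in code[i + 1:])


classes = digit_sum_classes(m=10, n=3)
print([len(c) for c in classes.values()])
print(all(is_bidifferent(c) for c in classes.values()))
```

Each of the ten classes contains $10^2$ words, so each has a redundancy of
exactly 1 digit, and the classes are mutually disjoint by construction.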


Just for curiosity's sake two examples of maximal bidifferent codes with
a higher redundancy will be given.

000, 101, 202, 303, 404, 555, 656, 757, 858, 959,
011, 112, 213, 314, 410, 566, 667, 768, 869, 965,
022, 123, 224, 320, 421, 577, 678, 779, 875, 976,
033, 134, 230, 331, 432, 588, 689, 785, 886, 987,
044, 140, 241, 342, 443, 599, 695, 796, 897, 998;

This is a 50 word 3-digit decimal maximal bidifferent code. A similar one
with 52 words is given in the next example.

000, 101, 202, 303, 404, 505, 666, 767, 868, 969,
011, 112, 213, 314, 415, 510, 677, 778, 879, 976,
022, 123, 224, 325, 420, 521, 688, 789, 886, 987,
033, 134, 235, 330, 431, 532, 699, 796, 897, 998,
044, 145, 240, 341, 442, 543,
055, 150, 251, 352, 453, 554;
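
Both examples can be generated and checked mechanically. The generating rule
used below (two blocks of digits, each closed under a shifted addition) is
read off from the listed words and is an observation of this sketch, not a
construction stated in the text; the function names are likewise assumptions.

```python
from itertools import product


def block(lo, hi):
    """Words (a, b, c) over the digits lo..hi-1 with c the shifted sum of a and b."""
    k = hi - lo
    return {(a, b, lo + (a - lo + b - lo) % k)
            for a in range(lo, hi) for b in range(lo, hi)}


def differ(u, v):
    return sum(x != y for x, y in zip(u, v))


def is_bidifferent(code):
    return all(differ(u, v) >= 2 for u in code for v in code if u != v)


def is_maximal(code, m=10, n=3):
    """Every word of the space is a code word or one single error away from one."""
    return all(any(differ(w, c) < 2 for c in code)
               for w in product(range(m), repeat=n))


code50 = block(0, 5) | block(5, 10)
code52 = block(0, 6) | block(6, 10)
for code in (code50, code52):
    print(len(code), is_bidifferent(code), is_maximal(code))
```

Neither code can be extended, although both are far smaller than the optimal
100 words.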


The construction of an optimal bidifferent code is equivalent with a
generalization of the problem of the rooks, well-known from recreational
mathematics (see 27, p. 240). It is the problem of how to place m rooks
on an m-th order chessboard so that no rook can capture any other one in
a single move. The generalization uses an n-dimensional board with
generalized rooks, say hyperrooks. The equivalence is obvious since the
set U of all m-ary words can be taken as an n-dimensional m-th order
chessboard. A hyperrook placed on a field a covers exactly the fields of
the set $E_1(a)$. A bidifferent code is therefore a set of fields where
hyperrooks can be placed such that they cannot take each other in one move.
The code is maximal if there is no uncovered field left in U. The optimal
codes, having $m^{n-1}$ words, are the solutions of the rook problem. For
n=2 all maximal solutions are optimal, but the examples mentioned above show
that such is no longer the case for $n > 2$. Other error-types correspond
in this terminology with fancy chessmen, having esoteric ways of moving.
Theorem 1.2.0 can also be derived from theorem 1.1.2. The covering
index $c(a,1)$ is obviously n for every a, since this is the maximal
number of points differing from a on one place and from each other on
two places. Thus, as $|E_1(a)| = n(m-1)$ holds, it follows that the minimum
redundancy is $\lg_m(1 + n(m-1)/n) = 1$ digit.

Theorem 1.2.2 The redundancy of an n-digit minimum distance $2e+1$ m-ary
code is at least $\lg_m \{ \sum_{i=0}^{e} \binom{n}{i} (m-1)^i \}$ digits.
The proof follows at once from theorem 1.1.1 by substituting
$\binom{n}{i}(m-1)^i$ for $|E^i(a)|$.

This bound is known in the binary case from Hamming (17). Let the maximal
size of an n-digit m-ary code with minimum distance d with respect to the
single errors be denoted by $A(m,n,d)$.
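
Stated as an upper bound on the number of code words, theorem 1.2.2 says that
$A(m,n,2e+1)$ cannot exceed $m^n$ divided by the size of a ball of radius e.
A minimal sketch of that computation follows; the function names are
assumptions of the sketch.

```python
from math import comb


def ball_size(m, n, e):
    """Number of n-digit m-ary words within e single errors of a given word."""
    return sum(comb(n, i) * (m - 1) ** i for i in range(e + 1))


def sphere_packing_bound(m, n, d):
    """Upper bound on A(m, n, d) for an odd minimum distance d = 2e + 1."""
    if d % 2 == 0:
        raise ValueError("the bound of theorem 1.2.2 needs an odd distance")
    return m ** n // ball_size(m, n, (d - 1) // 2)


print(sphere_packing_bound(2, 7, 3))   # binary, 7 bits, distance 3
print(sphere_packing_bound(3, 4, 3))   # ternary, 4 digits, distance 3
```

The integer division only rounds the quotient down; a code meeting the bound
exactly is close-packed.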

Hamming proves in the same paper that $A(2,n,2e) = A(2,n-1,2e-1)$. His
reasoning is simple: Suppose a minimum distance $2e$ code with n bits
is given. By chopping off one bit an $(n-1)$-bit code is formed. Obviously
this code has at least a minimum distance $2e-1$ since the chopped-off
bit contributed at most one unit to the distance. Thus
$A(2,n-1,2e-1) \ge A(2,n,2e)$. Conversely, when a minimum distance $2e-1$
code with $n-1$ bits is given, an n-th bit can be added such that the
number of ones in each code word becomes even (parity check). The words
which were at a distance $2e-1$ from each other are now necessarily
different on the n-th position, as $2e-1$ is odd. For the pairs which had
already a greater distance the n-th bit is irrelevant, so that a minimum
distance $2e$ code with n bits is derived. Hence
$A(2,n,2e) \ge A(2,n-1,2e-1)$ and therefore $A(2,n,2e) = A(2,n-1,2e-1)$.
Only the first part of this reasoning is valid for the higher number bases
and thus:

Theorem 1.2.3 $A(m,n-1,2e-1) \ge A(m,n,2e)$ for $m > 2$.

A counter example will show that the converse of the theorem above is
not true. Consider a 4-digit minimum distance 3 ternary code. Substitution
in 1.2.2 gives $A(3,4,3) \le 3^4/(1+4(3-1)) = 9$. The 9 word code:
0000, 0111, 0222, 1021, 1102, 1210, 2012, 2120, 2201 is therefore optimal.

This code is in fact a Graeco-Latin square of order 3.

A 5-digit minimum distance 4 ternary code would, if it had 9 words, be
the same as 3 Latin squares of order 3, such that each pair forms a
Graeco-Latin square of the same order. But it is well-known that this
is impossible (see Ryser 42 p.80).

With the aid of theorem 1.1.2 an upper bound for $A(m,n,2e)$ can be
derived which is better than the combination of the theorems 1.2.2 and
1.2.3 if $e \nmid n$, and not worse if $e \mid n$.

Theorem 1.2.4
$$ A(m,n,2e) \le m^n \Big/ \Bigl( \sum_{i=0}^{e-1} \binom{n}{i}(m-1)^i
   + \binom{n}{e}(m-1)^e / \mathrm{entier}(n/e) \Bigr). $$

Proof: The maximal number of words differing from a certain word a
on e places and from each other on at least $2e$ places is clearly equal
to $\mathrm{entier}(n/e)$. The theorem now follows by substitution of
$|E_1^e(a)|$ and $c_e$.

That this bound is an improvement can be seen by comparison:
$$ \sum_{i=0}^{e-1} \binom{n}{i}(m-1)^i
   + \binom{n}{e}(m-1)^e / \mathrm{entier}(n/e)
   \ge m \sum_{i=0}^{e-1} \binom{n-1}{i}(m-1)^i $$
or
$$ \binom{n}{e}(m-1)^e / \mathrm{entier}(n/e)
   \ge \sum_{i=0}^{e-1} \Bigl( m\binom{n-1}{i} - \binom{n}{i} \Bigr)(m-1)^i
   = \sum_{i=0}^{e-1} \binom{n-1}{i}(m-1)^{i+1}
     - \sum_{i=0}^{e-1} \binom{n-1}{i-1}(m-1)^i
   = \binom{n-1}{e-1}(m-1)^e. $$
Finally $\binom{n}{e} / \binom{n-1}{e-1} = n/e \ge \mathrm{entier}(n/e)$,
which is obviously true, and the equality sign therefore only holds if
$e \mid n$.

In connection with $A(2,n,2e) = A(2,n-1,2e-1)$ an interesting corollary
follows. For, $A(2,n,2e+1) = A(2,n+1,2e+2) \le 2^n / \sum_{i=0}^{e} \binom{n}{i}$,
where the equality sign only holds if $e+1 \mid n+1$, hence

Corollary: A close-packed minimum distance $2e+1$ binary code is only
possible if $e+1 \mid n+1$.

This corollary easily shows the nonexistence of a 90 bit minimum
distance 5 code. Though Hamming's sphere packing condition,
$\sum_{i=0}^{2} \binom{90}{i} = 2^{12}$, and the condition which comes from
theorem 1 of Shapiro and Slotnick (45) are both satisfied,
$2+1 \mid 90+1$ does not hold.

The upper bound for $A(3,5,4)$ becomes, by theorem 1.2.4,
$A(3,5,4) \le 3^5/(1 + 10 + 40/2) = 243/31 < 7.9$.
The true value for $A(3,5,4)$ is 6, as is shown by the example: 00000,
01111, 11202, 12120, 20221, 22012. That this code is optimal can be
seen as follows: Suppose that 3 words had the same symbol on the same
position, say a 0 on the first position. Then these words have to differ
on all other positions. Since permutations of the symbols per position
do not change the distance between the words, those 3 words may be
taken as 00000, 01111, 02222. These words form however a maximal code,
for each 5-digit word has to have at least 2 equal symbols on the last
4 places and therefore cannot differ on 4 places with those 3 words.
Consequently each symbol can occur at most twice on each position,
which is so in the example. Hence 6 is the maximal number of code
words. The same reasoning shows:

Theorem 1.2.5 $m(m-1) \ge A(m,m+2,m+1)$.

For m=4 this gives $12 \ge A(4,6,5)$, but this bound can be sharpened by
remarking that, for $m > 3$, an optimal code cannot have $2(m-1)$ words
which share on a certain position $m-1$ symbols of one kind, say a 0,
and $m-1$ symbols of another kind, say a 1. As the first $m-1$ words one
may again take

0  0    0    ...  0
0  1    1    ...  1
.  .    .         .
0  m-2  m-2  ...  m-2

and as the second $m-1$ words one may take:

[array not recoverable]

In each of the words starting with a "1", the m-th symbol (i.e. "m-1")
has to occur at least twice, as there are $m+1$ places left and as the
symbols from 0 to $m-2$ may occur only once in each of those words. But
the m-th symbol itself may occur only once on each position in the
words having already a "1" in common. Thus there are $2(m-1)$ positions
required and $2(m-1) > m+1$ for $m > 3$. Thus:

Theorem 1.2.6 $(m-1) + (m-1)(m-2) = (m-1)^2 \ge A(m,m+2,m+1)$ for $m > 3$.

That this bound is sharp, at least for m=4, is shown by the example:

000000
011111
022222
133210
120331
230123
213302
331032
302313

Thus $A(4,6,5) = 9$.
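
The three small example codes quoted in this section (the 9 word ternary
code, the 6 word ternary code and the 9 word quaternary code) can be checked
by computing their minimum distances directly. The word lists below are
transcriptions made for this sketch and should be compared with the printed
tables; the function name is an assumption.

```python
from itertools import combinations


def minimum_distance(code):
    """Smallest number of differing positions over all pairs of code words."""
    return min(sum(x != y for x, y in zip(a, b))
               for a, b in combinations(code, 2))


ternary_4 = "0000 0111 0222 1021 1102 1210 2012 2120 2201".split()
ternary_5 = "00000 01111 11202 12120 20221 22012".split()
quaternary_6 = ("000000 011111 022222 133210 120331 "
                "230123 213302 331032 302313").split()

for code in (ternary_4, ternary_5, quaternary_6):
    print(len(code), minimum_distance(code))
```

The sizes and distances agree with $A(3,4,3) = 9$, $A(3,5,4) = 6$ and
$A(4,6,5) = 9$ as lower bounds; the matching upper bounds come from the
theorems above.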


For m >â& a better upper bound is given in: 

Theorem 1.2.7 If d >n(m-1)/m then md/(md-n(m-1))> Alm,n,d). 

Proof: Let an n-digit m-ary code with minimum distance d have k words.
Any pair of words has the same digits on at most n-d places. Call an
occurrence of equal digits on the same position a match. Since the
number of word pairs is $\binom{k}{2}$, there are at most
$(n-d)\binom{k}{2}$ matches in the code. Now let $k_{ij}$ be the number
of code words which have the i-th digit on the j-th position. The
number of matches on the j-th position is then
$\sum_{i=1}^{m} \binom{k_{ij}}{2}$. Now the minimum of this sum is
reached if all $k_{ij}$ are equal, for it is well known that
$\min \sum_{i=1}^{m} x_i^2$ with $x_i \geq 0$ and $\sum_{i=1}^{m} x_i = k$
is reached for $x_i = k/m$. Thus

$$\sum_{i=1}^{m} \binom{k_{ij}}{2} = \sum_{i=1}^{m} (k_{ij}^2 - k_{ij})/2 \geq m(k/m)^2/2 - k/2 = (k^2/m - k)/2.$$

So that the total number of matches is at least $n(k^2/m - k)/2$ and thus
$(n-d)\binom{k}{2} \geq n(k^2/m - k)/2$ has to hold. Division by k/2 gives
$(n-d)(k-1) \geq n(k/m - 1) = n(k-m)/m$. After shuffling terms to
$md \geq k(md - (m-1)n)$ the theorem is proved.

Corollary: $m(m+1)/2 \geq A(m,m+2,m+1)$. For $m \geq 5$ this bound is better than the one
of theorem 1.2.6. It is not known to the author whether A(5,7,6)=15 is true.
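As a small numerical illustration, the bound of theorem 1.2.7 can be evaluated exactly and compared with the bound of theorem 1.2.6. This is a sketch only; the function name `plotkin_bound` is chosen here and does not occur in the text.

```python
from fractions import Fraction
from math import floor

def plotkin_bound(m, n, d):
    """Upper bound md/(md - n(m-1)) of theorem 1.2.7; requires d > n(m-1)/m."""
    denominator = m * d - n * (m - 1)
    if denominator <= 0:
        raise ValueError("theorem 1.2.7 needs d > n(m-1)/m")
    return Fraction(m * d, denominator)

for m in range(3, 8):
    bound = plotkin_bound(m, m + 2, m + 1)
    # Columns: m, the bound of the corollary, the bound (m-1)^2 of theorem 1.2.6.
    print(m, floor(bound), (m - 1) ** 2)
```

For n=m+2 and d=m+1 the denominator is always 2, which is where the value m(m+1)/2 of the corollary comes from.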


In the binary case the bound of theorem 1.2.7 is known as the Plotkin 
bound (see 38). This bound is also known to be true if m is a prime 


power (see 37). 


1.3. Check digits.





Adding a check digit to the code words is perhaps the best known method
for introducing redundancy. A check digit is a digit which is determined
by the other digits. The latter are free to take any value and are for
that reason called information digits. Let M(m,n) be the set of all
m-ary code words with n digits. If a code word of M(m,n) is extended by
a check digit then it becomes a member of M(m,n+1). Thus M(m,n+1) contains
a subset C of such extended code words and $|C| = |M(m,n)| = m^n = |M(m,n+1)|/m$.
The redundancy of C is 1 digit. The introduction of a check digit is an
orderly way to define a subset with a redundancy of 1 digit. If the check
digit can be expressed as a function which admits a simple computation,
it may also be a concise way of doing it. Moreover it is often possible
to derive the detecting properties from the properties of the function.
It is in general not true that every code with 1 digit redundancy can
be considered as a code with a check digit. Examples are the 3 bit binary
code {000,001,101,111} and the 2 digit ternary codes {10,21,20} or
{00,01,11}. That for instance the third bit in the first code cannot
be a check bit follows from the observation that in the first two words
the first two bits are equal and therefore cannot give different check
bits. For bidifferent codes however the converse is in fact true.

Theorem 1.3.0 In a bidifferent code with 1 digit redundancy each digit
can be considered as check digit.
Proof: The i-th digit can be considered as a check digit if all words
are different on the other n-1 positions. This is obviously so because
of the bidifference of the code.
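The criterion used in this proof is easy to test on small codes. The sketch below (the function name is illustrative) lists the positions of a code that can serve as check digit, i.e. the positions whose removal leaves all words different.

```python
def check_positions(code):
    """Positions whose digit is a function of the digits on all other positions."""
    n = len(code[0])
    positions = []
    for i in range(n):
        rest = {w[:i] + w[i + 1:] for w in code}
        # Position i is a check digit iff no two words agree on all other positions.
        if len(rest) == len(code):
            positions.append(i)
    return positions

print(check_positions(["000", "001", "101", "111"]))  # the binary counterexample
print(check_positions(["000", "011", "101", "110"]))  # the 3-bit parity check code
```

The parity check code is bidifferent, so every position qualifies, whereas the first code has no position that qualifies.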

There are various ways to define a check digit or, what amounts to the same,
to define a function on a finite set. The most general one is the method
of the table look-up. The arguments of the function are simply listed and
the proper function value is entered behind each argument. The method,
though general, is not attractive for applications, the very simple ones
excluded. It is in cases of any interest virtually impossible to construct
a code with prescribed properties. Moreover it is only possible
by means of a large memory to have automatic detection of errors. It is
therefore natural to apply check digits defined by some sort of a formula,
or an algorithm. A simple example is the parity check in the binary case.
The check bit is chosen in such a way that the number of 1's in the code
words (check bit included) is even. The parity check is well known and
finds wide application in computer design. Binary codes however
are not popular for use by human beings. It is mentioned here only as
an illustration. It may be interesting to note that the complement of
the parity check is a disjoint code, called the imparity check. The space
U of the n-bit code words is divided into two equal parts, i.e. the
words with an even number of 1's (the parity check code) and the words
with an odd number of 1's (the imparity check code). The codes are
essentially the same since the inversion of one bit (i.e. interchanging
1 and 0) makes the codes identical. The parity check can easily be
generalized for an arbitrary number base m. Let $a_i$ be the symbol on the
i-th position of an (n-1)-digit m-ary code word. An n-th digit can then be
found such that $\sum_{i=1}^{n} a_i \equiv 0 \pmod{m}$ or
$a_n \equiv -\sum_{i=1}^{n-1} a_i$. This check is called
the straight modulo m check. It was used in the proof of theorem 1.2.1.
For m=2 it is the parity check. The detecting properties of this check
will be discussed in ch. 2. At present it only serves as an example for
the generation of a check digit by means of a formula. In that light
it is important to note that the check digit can be found recursively
as follows: Take $c_0 = 0$ and $c_i = c_{i-1} - a_i$ for i > 0, then
$c_{n-1}$ is the check digit. If the digits of a word are fed into a
cyclic m-counter one after the other, the counter will end in the initial
state after the check digit has been entered. The state of the circuit
gives in each stage the value of the check digit, no matter how many
digits are fed into it. This is a very desirable property for the
technical implementation of check digit verifiers. The example of the
cyclic m-counter is simple in the sense that its reaction is independent
of the positions of the various digits. The drawback is of course that
the code is of no high quality. Later on codes will be introduced with
crooked and position dependent "m-counters". By a crooked m-counter is
understood a (sequential) circuit with m states $s_i$ and m possible
inputs $a_i$, $0 \leq i \leq m-1$, such that two conditions are fulfilled, i.e.:
i) Any input acting upon the circuit in different states has to bring
the circuit into different states.
ii) From any state, different inputs have to bring the circuit
into different states.
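The recursive computation of the straight check described above can be sketched as follows. The function names and the sample word are illustrative assumptions; the recurrence is the one in which each digit is subtracted from the running state modulo m.

```python
def straight_check_digit(digits, m=10):
    """Check digit of the straight modulo m check, computed recursively."""
    c = 0                      # c_0 = 0
    for a in digits:
        c = (c - a) % m        # c_i = c_{i-1} - a_i
    return c

def is_code_word(word, m=10):
    """A word belongs to the code iff all its digits sum to 0 modulo m."""
    return sum(word) % m == 0

info = [6, 7, 1, 4, 6]
word = info + [straight_check_digit(info)]
print(word, is_code_word(word))
```

Since the sum does not depend on the order of the digits, a word with two digits interchanged passes the same test, which illustrates why this counter gives a code of no high quality.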

Let $a_j$ bring the circuit from the state $s_i$ into the state $s_k$ and let
this be denoted by $s_i \times a_j = s_k$. The state transition matrix defined
by $s_i \times a_j$ is a Latin square. This is so because by i) no row contains
an element twice and by ii) the same holds for the columns. Hence in
each row and in each column every symbol (state) occurs just once.
The equation $s_i \times a_j = s_k$ is therefore uniquely solvable. Representing the
states and the inputs by the same set of m symbols gives an algebraic
structure which is known as a quasi group (see 16 p.7). A quasi group
is a set Q in which a binary operation $\times$ is defined such that the
equations $a \times x = b$ and $x \times a = b$ are both uniquely solvable for x
if $a,b \in Q$. Relatively little is known about Latin squares or quasi groups.
Of importance is the known (see 14):

Theorem 1.3.1 A quasi group in which the associative law
$a \times (b \times c) = (a \times b) \times c$ holds is a group.

As a matter of fact this may be taken as the definition of a group, in which
case it is of course no theorem. It is possible to define a crooked
m check by means of a quasi group $(Q,\times)$ as follows:

i) Choose two elements $c_0$ and $c_n$ arbitrarily in Q.

ii) Define $c_{i+1}$ for $0 \leq i < n-1$ recursively with $c_{i+1} = c_i \times a_{i+1}$,
where $a_i$ is the i-th digit of an (n-1)-digit code word with symbols from Q.

iii) The solution of $c_{n-1} \times x = c_n$ is the check digit $a_n$. The check
equation of such a crooked m check would be
$(\ldots((c_0 \times a_1) \times a_2) \times \ldots) \times a_n = c_n$.


Theorem 1.3.2 Every crooked m check is $E_1$-proof.

Proof: Let $a_1 a_2 \ldots a_n$ be a word of the code satisfying
$(\ldots((c_0 \times a_1) \times a_2) \times \ldots) \times a_n = c_n$ and let the i-th digit, after making
a single error, be $a_i'$. If this erroneous word also belonged to the code
then $(\ldots((c_0 \times a_1) \times \ldots) \times a_i') \times \ldots) \times a_n = c_n$ would hold and hence
$(\ldots(c_{i-1} \times a_i) \times \ldots) \times a_n = (\ldots(c_{i-1} \times a_i') \times \ldots) \times a_n$.
By cancelling all unaffected a's on the right of $a_i$ and $a_i'$, it would
follow that $c_{i-1} \times a_i = c_{i-1} \times a_i'$ and after cancelling $c_{i-1}$
from the left that $a_i = a_i'$, contrary to the
hypothesis that a single error was made.
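The construction and the theorem can be tried out on a small scale. The sketch below uses the operation (3a+b) mod 10 as a sample quasi group; this operation, the choice $c_0 = c_n = 0$ and all function names are assumptions made for illustration and are not taken from the text.

```python
from itertools import product

M = 10

def op(a, b):
    """A sample quasi group operation on 0..9 (an assumption, not from the text)."""
    return (3 * a + b) % M

def check_digit(info, c0=0, cn=0):
    """Run c_{i+1} = c_i x a_{i+1}, then solve c_{n-1} x a_n = c_n for a_n."""
    c = c0
    for a in info:
        c = op(c, a)
    return next(x for x in range(M) if op(c, x) == cn)

def is_code_word(word, c0=0, cn=0):
    """Evaluate the check equation (...((c_0 x a_1) x a_2) x ...) x a_n = c_n."""
    c = c0
    for a in word:
        c = op(c, a)
    return c == cn

def single_errors_detected(n_info=3):
    """Exhaustively confirm that no single error turns a code word into a code word."""
    for info in product(range(M), repeat=n_info):
        word = list(info) + [check_digit(info)]
        for pos in range(len(word)):
            for wrong in range(M):
                if wrong != word[pos]:
                    bad = word[:pos] + [wrong] + word[pos + 1:]
                    if is_code_word(bad):
                        return False
    return True

print(single_errors_detected())
```

The sample operation is not associative, so the check is a genuinely crooked one; the exhaustive test nevertheless finds no undetected single error, in agreement with the theorem.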

Since the vast majority of the real life errors affects only single
digits, codes which are not $E_1$-proof are of little interest for
applications. The codes published until now are mostly of the crooked sum
type, or at least can be viewed as such. In fact the straight sum
check is also a special case, based on the cyclic group. For the
decimal codes, which are after all the subject of this monograph, it is
of interest to know how many Latin squares exist. This number seems

to be unknown up to now, but it is at least ex10°". 

It is hardly surprising that not all these possibilities have been
tested on their detecting merits, especially so since most of the
codes are very hard to analyse. Only two of the quasi groups of order
10 are associative, and thus admit, as will be seen later on, a fairly
easy study.

The next stage of complexity is that the way of counting is not only
crooked but also dependent on the position of the digit which is fed
into the circuit. Let n quasi groups be given, all based on the same
set Q but with different operations $\times_i$ for $0 < i \leq n$. The recipe for
making a check digit is the same as above except that the recurrence
is now defined by $c_{i+1} = c_i \times_{i+1} a_{i+1}$ for $0 \leq i < n-1$ and the check digit
$a_n$ by the equation $c_{n-1} \times_n a_n = c_n$. These codes are also $E_1$-proof as
can be seen by the same arguments as used for the proof of theorem
1.3.2. The number of possible decimal codes becomes now hopefully or
distressingly high. Hopefully because the chance that a desirable one
exists has gone up, but distressingly because the chance that such a one
can be found goes down. The latter is even more so since the job of
testing each case gets harder too. If a periodic sequence of quasi
groups is taken the procedure is less complicated and hence more
manageable. Most of the new codes of chapter 3 fall into this category.
Let this period be two, then the recurrence becomes
$c_{2i+1} = c_{2i} \times_1 a_{2i+1}$, $c_{2i+2} = c_{2i+1} \times_2 a_{2i+2}$, or by taking two steps at the
time $c_{2i+2} = (c_{2i} \times_1 a_{2i+1}) \times_2 a_{2i+2}$. The latter relation can be considered
as a ternary operation which, written in the functional notation,
looks like $c_{2i+2} = g(c_{2i}, a_{2i+1}, a_{2i+2})$. The ternary function g is
equivalent with a Latin cube of a special type, namely one constructed
with the aid of two Latin squares. The code would work just as well if
it were made with a more general Latin cube. That these exist is shown
in the next theorem.
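That two steps of a period-two recurrence form a Latin cube can be verified directly. In the sketch below the two quasi group operations are sample choices (assumptions, not the author's), and `is_latin_cube` tests the defining property that the function is one-to-one in each argument when the other two are held fixed.

```python
from itertools import product

M = 10

def op1(c, a):
    """First sample quasi group operation (an assumption)."""
    return (c + a) % M

def op2(c, a):
    """Second sample quasi group operation (an assumption)."""
    return (3 * c + a) % M

def g(c, a, b):
    """Two steps of the period-two recurrence taken at once."""
    return op2(op1(c, a), b)

def is_latin_cube(f, m):
    """True if f is a bijection in each argument when the other two are fixed."""
    symbols = set(range(m))
    for x, y in product(range(m), repeat=2):
        if {f(t, x, y) for t in range(m)} != symbols:
            return False
        if {f(x, t, y) for t in range(m)} != symbols:
            return False
        if {f(x, y, t) for t in range(m)} != symbols:
            return False
    return True

print(is_latin_cube(g, M))
```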

Theorem 1.3.3 There exists a Latin cube not based on two Latin squares.

Proof: Consider the 4x4x4 Latin cube with the following four layers:

[the four layers of the cube are illegible in the source]

If this cube g(i,j,k) were based on the Latin squares p(i,j) and
q(i,j) then g(i,j,k)=p(i,q(j,k)) would hold. Now if g(i,j',k')=
g(i,j,k) it follows that q(j',k')=q(j,k) and hence that g(i',j',k')=
g(i',j,k) is true. In the example however g(1,0,1)=0=g(1,1,0) but
g(0,0,1)=1 and g(0,1,0)=2. Since g(0,0,0)=g(1,0,1)=0 and g(0,1,0)=2
but g(1,1,1)=1 it follows that g(i,j,k)=p(j,q(i,k)) does not hold
either. Nor does g(i,j,k)=p(k,q(i,j)) because g(1,0,1)=g(0,1,1)=0
and g(1,0,0)=1 but g(0,1,0)=2.

A Latin cube like the one mentioned above will be called irreducible.
Codes based on irreducible Latin cubes have not yet come to the
attention of the author. In general an n-digit $E_1$-proof code can be
considered as a Latin hypercube with n dimensions. An n-dimensional
Latin hypercube is said to be the product of two Latin hypercubes if
$g(i_1,\ldots,i_n) = p(i_1,\ldots,i_k,\, q(i_{k+1},\ldots,i_n))$. If such is the case the
hypercube g is called reducible. The coordinates do not have to occur
in the same sequence on both sides of the equation. The factors of a
reducible hypercube may be reducible too. Thus an n-dimensional
hypercube may be the product of n-1 Latin squares. If such is the case it
will be called completely reducible. All but one of the known codes
which will be presented in the next chapter are completely reducible
Latin hypercubes. As such they all admit a simple graphical
representation which consists of a "staircase" of Latin squares as shown on page 37.

The check digit belonging to the code word $a_1 a_2 \ldots a_6$ is found as follows:
First select the top entry $a_1$, then take $a_2$ in the column headed by $a_1$.
After that $a_3$ is searched in the same row as $a_2$ but in the next square,
$a_4$ is found in the same column as $a_3$ but in the square below and so
on. Finally the check digit is found at the righthand side of the last
square in the same row as $a_6$. In the example the check digit of 671465

is found to be Á. The idea of the Latin staircase is probable very old. 
It can be found already in the papers of W. Friedman, the renowned American
cryptanalyst (13). The method is good for field use and for instructional
purposes.
1.4. Check equations. 


In general it is advantageous not to use the functional relation between
the check digit and the information digits in its explicit form
$a_n = f(a_1,\ldots,a_{n-1})$, but to use an implicit form
$g(a_1,\ldots,a_n) = \text{constant}$ instead. Because of theorem 1.3.0 the two
forms are equivalent for $E_1$-proof codes with one digit redundancy. For
codes with a higher redundancy however the check equation is more general.
The popular modulo 11 check for decimal codes is an example. This check
will be discussed at length in chapter 2. Here it is sufficient to note
that this code in its most common form is defined as the set C of all
words satisfying the equation $\sum_{i=1}^{n} (-1)^i a_i \equiv 0 \pmod{11}$. Now $a_n$,
or any other $a_j$, is not always solvable from the equation, since
$a_n = (-1)^{n-1} \sum_{i=1}^{n-1} (-1)^i a_i$ may have
the value 10, so that no decimal digit $a_n$ can satisfy the equation.
The problem might be solved by narrowing down the range of the function
$f(a_1,\ldots,a_{n-1})$, in other words by taking a function for which only
"proper" values for the arguments are allowed.
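The difficulty with the value 10 is easily made concrete. The following sketch solves the check equation for the last digit, assuming that the equation is the alternating sum over the positions 1 to n taken modulo 11; the function name and the sample words are illustrative.

```python
def mod11_check_digit(info):
    """Solve sum_{i=1}^{n} (-1)^i a_i = 0 (mod 11) for the last digit a_n.

    Returns a value in 0..10; the value 10 means that no decimal digit fits.
    """
    n = len(info) + 1
    partial = sum((-1) ** i * a for i, a in enumerate(info, start=1))
    # (-1)^n a_n = -partial, hence a_n = (-1)^(n-1) * partial (mod 11)
    return ((-1) ** (n - 1) * partial) % 11

print(mod11_check_digit([1, 2, 3, 4, 5]))   # a proper decimal digit
print(mod11_check_digit([0, 1, 0, 0, 0]))   # 10: no decimal check digit exists
```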


[Figure, page 37: the staircase of Latin squares. The figure is not recoverable from the source.]

Another advantage of the check equation is that it often admits an
easier analysis of the various detecting properties, since the check
digit no longer plays a special role. Also from a technical point of
view it is as a rule better to have a check procedure which is uniform
for all digits. For general codes which are irreducible Latin
hypercubes the difference would vanish. It then is only a different way
of looking at the same thing.


1.5. A curious 3 digit decimal code. 





It is remarkable that up to now no pure decimal codes with a redundancy
of one digit are known which detect all single errors, all
transpositions and all twin errors. It will be hard to prove that
such codes cannot exist, since the proof would have to depend on
special properties of the number 10, as for other number bases there
do exist codes with said properties. An example will be given of a
3-digit decimal code which detects not only the error types mentioned
above, but also the jump transpositions and the jump twin errors,
as well as the phonetic errors. This example shows that the
non-existence would only be valid for codes with more than 3 digits.

The three digit code is equivalent with a Latin square (i.e. single
error-proof), with certain special properties. Denote the elements
of the square by $a_{ij}$. The detection of the transpositions requires
that: 1) $a_{ij} \neq a_{ji}$ for $i \neq j$ and 2) if $a_{ij} = k$ then $a_{ik} \neq j$. The twin
error detection requires that: 3) $a_{ii} \neq a_{jj}$ for $i \neq j$ and that 4) if
$a_{ij} = j$ then $a_{ik} \neq k$ for $k \neq j$. Finally the detection of the jump
transpositions requires that: 5) if $a_{ij} = k$ then $a_{kj} \neq i$ for $i \neq k$, whereas for
the detection of the jump twin errors it is necessary that: 6) if
$a_{ij} = i$ then $a_{kj} \neq k$ for $k \neq i$.
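The six conditions lend themselves to a mechanical test. The sketch below reads the code words as the triples (i, j, a_ij) and reports which conditions a given square violates; condition 2) is applied for j different from k only, since interchanging equal digits is no error. The two sample squares are assumptions added for illustration: a base 7 square derived from the weighted check i + 2j + 3k = 0 (mod 7), and the addition table modulo 10.

```python
def violated_conditions(sq):
    """Return the set of conditions 1)..6) that the Latin square sq violates.

    sq[i][j] is the entry a_ij; the code words are the triples (i, j, a_ij).
    """
    m = len(sq)
    bad = set()
    for i in range(m):
        for j in range(m):
            k = sq[i][j]
            if i != j and sq[j][i] == k:
                bad.add(1)                      # ijk -> jik undetected
            if j != k and sq[i][k] == j:
                bad.add(2)                      # ijk -> ikj undetected
            if i != j and sq[i][i] == sq[j][j]:
                bad.add(3)                      # iik -> jjk undetected
            if i != k and sq[k][j] == i:
                bad.add(5)                      # ijk -> kji undetected
            for t in range(m):
                if k == j and t != j and sq[i][t] == t:
                    bad.add(4)                  # ijj -> itt undetected
                if k == i and t != i and sq[t][j] == t:
                    bad.add(6)                  # iji -> tjt undetected
    return bad

# Illustrative squares (assumptions, not taken from the text):
base7 = [[(2 * i + 4 * j) % 7 for j in range(7)] for i in range(7)]
cyclic10 = [[(i + j) % 10 for j in range(10)] for i in range(10)]

print(violated_conditions(base7))
print(violated_conditions(cyclic10))
```

The base 7 square violates none of the conditions, which is in line with the remark that codes with these properties exist for other number bases, whereas the plain addition table modulo 10 violates all six.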

4) is equivalent with the condition that each row has, as a permutation
of the column entries, at most one fixed point. Since each column
contains all 10 decimals, every entry is fixed in some row and
never in two rows. Hence each row permutation has to have exactly
one fixed point. The same conclusion holds for the column permutations
with regard to 6). The condition 3) requires that the main diagonal
of the square is a permutation of the decimal digits. By putting $a_{ii} = i$
all three conditions are fulfilled. This takes care of the main
diagonal. The remaining 90 places outside the main diagonal can be
divided into 30 triplets satisfying $a_{ij} = k$, $a_{jk} = i$, $a_{ki} = j$ with
$i \neq j$, $j \neq k$ and $k \neq i$. All triplets, considered as unordered triplets,
should be different. The conditions 1), 2) and 5) are fulfilled if the
triplets can be arranged in 30 blocks, each containing 3 decimals and
each decimal occurring in 9 blocks. Moreover each pair has to occur only
once as an ordered pair. The design on the next page fulfils the requirements.

Each of the 30 blocks has to be oriented to define the ordered pairs.
There are 16 ways to assign the orientation, since the blocks (rows)
fall apart into four orientation independent classes, namely {0:3};
{4:21}; {22,24,26,28}; {23,25,27,29}. The orientation can per class
be inverted, independent of the other classes. An inversion of all
classes results in a reflexion of the entire square with respect to
the main diagonal. Hence 8 different solutions are obtained in this
way. The resulting square, corresponding with the orientation given
to the right of the block design, is written out below. The code does

not detect all phonetic errors, but by interchanging È and 4, it does. 
The square obtained after carrying out this exchange is given next
to the original one. From a practical point of view the code is perhaps
not recommendable since none of the triple transcription errors
aaa→bbb is detected. A further disadvantage is that none of the cyclic
errors abc→bca is detected. This error type might very well be expected
for the small 3-digit code words.

The block design: [table not recoverable from the source]

The resulting square: [table not recoverable from the source]

The interchanged square: [table not recoverable from the source]





This vulnerability for new error types is again an example of the
designer's dilemma, that the design constructed by virtue of some
regularity is weak because of that very regularity. The irregular
designs however, though often more numerous, are as a rule harder
to find. Moreover the verification of the properties usually is also
more difficult. For curiosity's sake, an irregular code which has the
same virtues as the regular one above will be given. Though none
of the triple transcription errors are detected by this code, it turns
out that 82.8% of the cyclic errors is detected, which makes the
irregular code superior to the regular one.


The irregular square: 






To extend such a 3-digit code to a 4-digit one is a tremendous task.
It would be equivalent with the construction of a Latin cube satisfying
a number of asymmetry conditions.


1.6. Single error correcting decimal codes.


According to the upper bound given in theorem 1.2.2 with e=1, the
maximum number of code words in a minimum distance 3 m-ary code with
n digits is $m^n/(1+(m-1)n)$. This means that for n=1 the upper bound
is 1, giving the notorious, perfect 1-word code. For n=m+1 a perfect
code with 2 check digits and m-1 information digits seems possible,
as $1+(m-1)(m+1) = m^2$. It is well known (see 37) that these codes
exist if m is the power of a prime. It is not known (at least not to
the author) whether these codes exist for other m. Some light is thrown
on this by the next theorem:




Theorem 1.6.0. If a minimum distance 3 m-ary code with m+1 digits and
$m^{m-1}$ words exists, then there exists a Graeco-Latin square of the
m-th order.

Proof: Consider the subset of the code words ending with m-3 fixed
digits, say zeros on the last m-3 places. This is a set of $m^2$ words
which differ from each other on at least 3 places of the first 4.
From this it follows that on the first 2 places all $m^2$ combinations
occur exactly once. Let these digits be denoted by i and j and let
the digit on the third and the fourth place be $a_{ij}$ and $b_{ij}$
respectively. The matrices $a_{ij}$ and $b_{ij}$ are the orthogonal Latin
squares required to prove the theorem.
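The correspondence used in the proof also works in the opposite direction: a pair of orthogonal Latin squares yields a 4-digit code with minimum distance 3. The sketch below illustrates this for a prime base, where the squares i+j and i+2j (mod m) are a standard orthogonal pair; this choice and the function names are assumptions made for the illustration.

```python
from itertools import combinations

def graeco_latin_code(m):
    """Words (i, j, a_ij, b_ij) from two orthogonal Latin squares of prime order m.

    The squares a_ij = i + j and b_ij = i + 2j (mod m) are a standard choice
    for an odd prime m; they are an assumption made for this sketch.
    """
    return [(i, j, (i + j) % m, (i + 2 * j) % m)
            for i in range(m) for j in range(m)]

def min_distance(code):
    """Smallest number of differing positions between two different words."""
    return min(sum(x != y for x, y in zip(u, v))
               for u, v in combinations(code, 2))

code = graeco_latin_code(5)
print(len(code), min_distance(code))
```

For m=5 this gives 25 words of 4 digits that differ pairwise on at least 3 places.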

Ten years ago it would have been conjectured that this theorem
disproved the existence of such a perfect code for m=10, but now only the
case m=6 can be discarded. For the existence of 10x10 Graeco-Latin
squares see Ryser (42) chapter 7. Such a Graeco-Latin square forms
a 4 digit minimum distance 3 decimal code with 100 words. Even the
existence of a 5 digit minimum distance 3 decimal code with 1000
words is unknown. The existence of the perfect code with $m^{m-1}$
words is an interesting combinatorial problem. Its place among the
other problems like the existence of finite projective planes is
not yet clear. Consider the statements:


A: m is a power of a prime number.
B: There exists a field with m elements.
C: There exists a finite projective geometry with m+1 points on each
line.
D: There exist m-1 mutually orthogonal m×m Latin squares.
E: There exists a pair of orthogonal Latin squares, i.e. a Graeco-Latin
square, of the m-th order.
F: There exists a (perfect) minimum distance 3 m-ary code with $m^{m-1}$
words of m+1 digits.

The following implications are known (see diagram on the next page):
A→B; B→C; C→D; D→E; and F→E; D→C; B→A; and B→F. The author
could not prove C→F nor D→F. For m=6, E has been disproved and
hence F and all others. For m=10, A and B are not true, E is true and
C, D and F are open.





[Diagram of the implications between the statements A to F; not recoverable from the source.]


For practical applications a double modulo 11 check with error
correcting capacity is available, with all the disadvantages of the
modulo 11 method.

In this chapter some known decimal error detecting codes will be
described. Also a few new ones, designed by the author, are included.
The codes are compared on the basis of certain conditions, set forth
in section 2.1. In judging the codes it should be borne in mind that
at the time that these codes were designed these requirements were
often not known, or more precisely that the designers were not aware
that those criteria were of importance. Two examples will suffice
to make this point clear. The I.B.M. code of section 2.3.0 was
designed to detect the single errors as well as the transpositions.
The code does detect 100% of the single errors, but only 98% of the
transpositions, since the transposition of 0 and 9 escapes detection. In
some applications therefore the code words in which a 0 and a 9 occur
on adjacent positions are omitted (see 33). A more serious flaw
however is that this code does not detect the jump transpositions
at all, a flaw which could have been overcome relatively easily. A
second example is the biquinary code of section 2.3.1. This code
was designed with the same objective as the I.B.M. code; as a matter
of fact the purpose was to do better on the transpositions. The code
was a success in the sense that the detection rate of the transpositions
is 100%, but it was sheer luck that the jump transpositions did not
escape detection entirely. That the new codes of section 2.3 are
doing better is therefore partly due to the fact that they are
tailored to the requirements set forth in section 2.1. More convincing
are therefore the tests on the set of 12112 real life errors
drawn from the daily operations of a clearing institute. All these
errors are made in code numbers with 6 decimals. The errors of
forgotten decimals have been eliminated beforehand.
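The figure of 98% can be reproduced under an assumption about the code in question. The sketch below assumes that the I.B.M. code is the familiar "double-add-double" check, in which every second digit is doubled and replaced by the sum of the digits of the result before all digits are added modulo 10. The text itself defines the code only in section 2.3.0, so this identification is a hypothesis.

```python
def f(d):
    """Digit sum of 2*d, the 'doubling' permutation of the double-add-double check."""
    return 2 * d if d < 5 else 2 * d - 9

def undetected_adjacent_transpositions():
    """Ordered pairs ab (a != b) for which ab -> ba leaves a + f(b) unchanged mod 10."""
    return [(a, b) for a in range(10) for b in range(10)
            if a != b and (a + f(b)) % 10 == (b + f(a)) % 10]

missed = undetected_adjacent_transpositions()
rate = 100 * (90 - len(missed)) / 90
print(missed, round(rate, 1))
```

Under the same assumption two digits that are two places apart receive the same treatment, so interchanging them never changes the sum; this would explain why the jump transpositions escape detection entirely.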


4.1 T 





It has already been stressed before that the requirements for a good
code cannot be set absolutely. They depend on the type of equipment
used and also on (the knowledge of) the error habits of the human
beings involved. The methods explained in the chapters 3, 4 and 5
are however also applicable to many other requirements than the ones
which follow:
1) The single errors are considered to be most important. The weight of
800 points is given to these errors.
2) Next come the transpositions (of adjacent digits) with a weight of
100 points.
3) The twin errors, aa→bb, get 10 points, just as
4) the jump transpositions, abc→cba.
5) The separated twin errors, aba→cbc, are a less important class
getting 5 points. These errors will be called jump twin errors.
6) The phonetic errors, 13→30 and the like, also get 5 points. These
errors may be for some languages of little importance, but this is
only an opinion, since no corroborative evidence is available.
In the next section the detection rates are often given without a proof,
since these will be given, except for the trivial cases, in the chapters
3, 4 and 5. The mathematical formulation of the requirements is also
postponed, since such a formulation is dependent on the method employed.
All but one of the $E_1$-proof codes of the next section are of the
completely reducible type. As such they form a set of words satisfying the
recursion $c_i = c_{i-1} \times_i a_i$, where $c_0$ and $c_n$ are fixed decimals, $a_i$ are
the digits of the word and where the decimals form a quasi group with
respect to each of the operators $\times_i$. A burst of two errors on the
positions i and i+1 is detected if and only if
$(c_{i-1} \times_i a_i) \times_{i+1} a_{i+1} \neq (c_{i-1} \times_i a_i') \times_{i+1} a_{i+1}'$
no matter what value $c_{i-1}$ has.
This inequality cannot be true for all possible bursts, for if $a_i$, $a_{i+1}$
and $a_i'$ are given, there always exists an $a_{i+1}'$ such that the equality
holds. That the requirements for detecting at the same time all
transpositions, all twin errors and all jump transpositions are not per se
too heavy is shown by the example in section 1.5.

For decimal codes based on a group, with an operation denoted by +,
a general form of the operators $\times_i$ may be defined by $a \times_i b = a + f_i(b)$.
If $f_i(x) = x$ for all x and for all i then the code is a straight check.
If $f_i$ is an automorphism of the group then the code is called a weighted
check. This is reasonable because the automorphisms satisfy the relation
f(x + y) = f(x) + f(y) and therefore behave like a weight. The weights
are, just like the automorphisms, linear operators and they are therefore
often written as a multiplier: f(x)=fx.

The permutations form also a group, the symmetric group. In the symmetric
group the automorphisms form a subgroup. The product of two permutations
f and g is defined as the permutation which brings x into the element
f(g(x)); it is denoted by fg. Hence fg(x) = f(g(x)) holds (see 55). If
$f_i = f^i$ for a fixed permutation f the code is called progressive.
Progressive codes are periodic. If the period is 2 then the code is
called alternating. In an alternating code the operations a+b and a+f(b)
are applied alternately. A weighted alternating code however may also
be considered as a straight code in which the same operation is used
throughout. This operation is defined by $a \times b = f(a)+b$. The same remark
holds for all weighted progressive codes, with the same definition for
the single operator. This construction is merely a version of the
algorithm of Horner. It is an illustration of the fact that some of the
properties defined above do not exclude each other, and are thus not
suitable for a classification.
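The remark about the algorithm of Horner can be illustrated as follows. In this sketch the weight f is multiplication by 3 modulo 10, an automorphism of the cyclic group chosen here as an example, and the powers of f are taken to decrease from left to right; both are assumptions about details which the passage leaves open.

```python
M = 10

def f(x):
    """A sample weight: multiplication by 3 is an automorphism of the cyclic group."""
    return (3 * x) % M

def weighted_sum(word):
    """Progressive weighted sum: a_1 gets weight f^(n-1), ..., a_n gets weight f^0."""
    total = 0
    n = len(word)
    for i, a in enumerate(word, start=1):
        term = a
        for _ in range(n - i):
            term = f(term)
        total = (total + term) % M
    return total

def horner_sum(word):
    """The same value with the single operation a x b = f(a) + b used throughout."""
    c = 0
    for a in word:
        c = (f(c) + a) % M
    return c

word = [6, 7, 1, 4, 6, 5]
print(weighted_sum(word), horner_sum(word))
```

Both functions return the same value for every word, because f distributes over the addition; the second form needs only one operation and no knowledge of the position of the digit.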

The discussions above are valid for all groups of order 10. It is well
known from the theory of groups that there are 2 groups of order 10
available (see Hall, 16, p. 52), i.e. the cyclic group of order 10 and the
dihedral group of order 10. The first one is the group of the rotations
of a regular 10-gon, denoted by $C_{10}$. Its operation is also the addition
modulo 10 or, what is the same, the counting in a cyclic 10-counter. The
cyclic groups are abelian, that is the commutative law $a \times b = b \times a$ holds.
The second group is the dihedral group, that is the group of the
transformations of the pentagon. This group contains not only the rotations,
but also the reflexions. It is denoted by $D_5$. The group is not
commutative, for rotating the pentagon over say 72 degrees and then reflecting
it is not the same as first reflecting it and then rotating it over 72
degrees. The idea of applying groups is that the condition for the
error detection is simplified by virtue of the associative law. As will
be seen in chapter 3, the use of the cyclic group gives a still greater
simplification.
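The dihedral group can be made tangible by writing its elements as permutations of the five vertices of the pentagon. The sketch below (with illustrative names) generates the group from one rotation and one reflexion and checks that the two do not commute.

```python
def rotation(k):
    """Rotation of the pentagon over k * 72 degrees, as a permutation of the vertices 0..4."""
    return tuple((v + k) % 5 for v in range(5))

REFLECTION = tuple((-v) % 5 for v in range(5))

def compose(f, g):
    """The product fg, i.e. first g and then f: fg(x) = f(g(x))."""
    return tuple(f[g[v]] for v in range(5))

def generate_group(generators):
    """Close a set of permutations under composition."""
    identity = tuple(range(5))
    group = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = compose(s, x)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

r = rotation(1)
d5 = generate_group([r, REFLECTION])
print(len(d5), compose(r, REFLECTION) == compose(REFLECTION, r))
```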

Much of the effort spent on the construction of decimal codes centers
on the detection of the transposition errors. This typically human type
of error has been bothering the cryptologists for a long time; it is
mentioned by Friedman as a psychological lapsus calami as early as 1932
(see 13). Since the straight modulo 10 check obviously is insensitive
to transpositions and in view of the fact that the alternating sum,
modulo an odd number, met the requirement, one naturally tried to do
something like that for the decimal codes too. The I.B.M. code of
2.3.0 seems to be the first trial in that direction. Unfortunately
the detection appeared not to be flawless and when some authors proved
that no decimal $E_1$-proof code with one redundant decimal could be
transposition-proof, the codes based on the eleven check became very
popular. Even nowadays many people still believe that one has to use
modulo 11 checks for the detection of the transpositions. Actually
the non-existence proof mentioned above is only valid for codes based
on the cyclic group $C_{10}$ with generalized weights which are independent
of the words themselves. The codes of 2.3.4 on page 56 are examples of codes
designed for detecting transpositions which are unfortunately no longer
$E_1$-proof.

The biquinary codes of chapter 5 designed by the present author are
however both $E_1$- and transposition-proof. The first one has many faces.
It is an alternating code in the sense that it can be defined by applying
two quasi groups alternately. It has also an interpretation as a
biquinary code and as a code based on addition modulo 10 with weights
and check value $c_n$ depending on the value of the digits. Last but not
least the code can be interpreted as a code based on the dihedral group
with generalized weights and as such it would fall under 2.3.2. The
generalized biquinary code however loses this interpretation (see
chapter 5). The merit of the generalized biquinary code is that it does
much better in the detection of the twin errors and the jump
transpositions. Finally the application of the dihedral group $D_5$ turns out to
give scores of $E_1$- and transposition-proof codes. Some of these do
even better than all other decimal codes mentioned above (see chapter 4).
Up to now no decimal code scoring 100% in all the categories has come
to the attention of the author. He was also unable to prove that such a
code does not exist. Such a proof would have to depend on properties of
the number 10 since for other number bases codes like that do exist.
Moreover in section 1.5 an example of a 3-digit decimal code scoring
100% in all 5 categories has been given. The performance of the modulo
11 check is difficult to compare with that of the pure decimal codes
since its redundancy is higher. For that reason alone these codes
detect 1% more in the realm of the random errors. A 6-digit decimal
code satisfying a modulo 11 check equation has at most 90910 words
whereas a pure decimal code will have 100000 words. Decimal codes
applied in situations where a modulo 11 check could also be used have
therefore a hidden redundancy, which is not taken into account in the
performance comparison tables. The main disadvantage of the eleven
checks is that the lexicographical ordering, according to the
information digits, of the code shows gaps, in contradistinction to a pure
decimal code. This may be a disadvantage for the efficiency of the
file handling and storage, but when it comes to application in an
existing system it implies a recoding of about 10% of the code words.
The resulting inconvenience for customers and the potential danger
of more errors make the application of the modulo 11 check in those
cases very unattractive. In some applications this difficulty is
overcome by giving no check digit in those cases where the 10-th symbol
would be required. The blank is then playing the role of the 10-th
symbol. The drawback is that the code will no longer have a fixed
length. In other applications the 0 has to play the double role of
the 0 and the 10-th symbol. However the remedy is worse than the
disease, as the code stops being $E_1$-proof.





The decimal codes, discussed in this chapter, may be classified as
follows:

1) The non-E_1-proof codes.
   1.1) Codes modulo k, with 10 > k.
   1.2) Codes modulo 10, using weights divisible by 2 or 5.
   1.3) The Bull type codes, using the sum of 2 alternating sums, each
        with a different modulus smaller than 10.
   1.4) Various modulo 11 checks, in which the "0" and the "10" are
        identified.

2) The E_1-proof codes with 1 decimal redundancy. All but one of the
   codes mentioned here are of the completely reducible type and as
   such they can be defined by the Latin staircase method. In the text
   however other definitions, admitting an easier analysis, will be
   employed.
   2.1) Codes based on the addition modulo 10, i.e. the cyclic group C_10.
        2.1.1) The straight sum check.
        2.1.2) The alternating sum check.
        2.1.3) The weighted sum checks.
        2.1.4) Sum checks with generalized weights.
   2.2) Biquinary codes.
        2.2.1) Alternating biquinary codes.
        2.2.2) Generalized biquinary codes.
   2.3) Codes based on the multiplication in the dihedral group of the
        pentagon, i.e. D_5.
        2.3.1) Straight product check.
        2.3.2) Weighted product checks.
        2.3.3) Periodic product checks with generalized weights.
        2.3.4) Non-periodic product checks with generalized weights.

3) The E_1-proof codes with a higher redundancy than one decimal.
   3.1) Checks based on the addition modulo k, with k > 10.
        3.1.1) Various modulo 11 checks.
        3.1.2) Checks modulo k, with k > 11.


2.3.0 Ch


The straightforward generalization of the parity check is the straight
modulo 10 check. This code consists of all words satisfying
a_1 + a_2 + ... + a_n ≡ c (mod 10). Although the code detects all single
errors (see 1.2), the obvious disadvantage is that no transpositions
are detected. The twin errors, aa→bb, are caught for 88.9%, since
2a ≡ 2b (mod 10) if a ≡ b + 5 (mod 10). It is a poor consolation that
the phonetic errors are detected for 100%.
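
The two rates quoted for the straight check can be recounted by enumerating all 90 ordered pairs of distinct digits. The following is a minimal sketch; the names `straight_sum`, `pairs` and the counters are illustrative and not part of the original text.

```python
from itertools import permutations

def straight_sum(word):
    """Check value of the straight modulo 10 check: digit sum modulo 10."""
    return sum(word) % 10

# All ordered pairs (a, b) of distinct digits: 90 of them.
pairs = list(permutations(range(10), 2))

# A transposition ab -> ba is detected if the check value changes.
transpositions_detected = sum(
    straight_sum((a, b)) != straight_sum((b, a)) for a, b in pairs
)

# A twin error aa -> bb is detected if the check value changes.
twins_detected = sum(
    straight_sum((a, a)) != straight_sum((b, b)) for a, b in pairs
)

print(f"transpositions: {transpositions_detected / len(pairs):.1%}")
print(f"twin errors:    {twins_detected / len(pairs):.1%}")
```

The twin errors that escape are exactly the pairs with a ≡ b + 5 (mod 10), ten ordered pairs out of ninety.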

A considerable improvement is given by the alternating check modulo 10. The
check equation for the alternating check is: a_1 - a_2 + a_3 - a_4 + ... ≡ c
(mod 10). This check detects 8 out of the 9 transpositions, since
a_1 - a_2 ≡ a_2 - a_1 (mod 10) only holds if 2a_1 ≡ 2a_2 (mod 10) or
equivalently a_1 ≡ a_2 + 5 (mod 10). The jump transpositions still remain
undetected. A major drawback is that the twin errors now escape detection
completely, since a - a = b - b for all a and b. The difficulty is
clearly that 10 contains a factor 2 and in fact for an odd modulus
this type of check would detect all transpositions. The alternating
check of the form a_1 + 2a_2 + a_3 + 2a_4 + ... ≡ c (mod 10) does detect
all transpositions, but is unattractive since it is not E_1-proof, as
an error of 5 units on the even positions does not change the sum
modulo 10. The root of the trouble is that the function 2x (mod 10)
always has an even value.


The I.B.M. code is an intelligent attempt to improve this situation, by
defining a permutation f by:

    f(x) = 2x       if 2x < 10
    f(x) = 2x - 9   if 2x ≥ 10

(the carry is added to the product 2x (mod 10)), hence

    f = ( 0 1 2 3 4 5 6 7 8 9 )
        ( 0 2 4 6 8 1 3 5 7 9 ).

The said I.B.M. code consists of all words satisfying:
a_1 + f(a_2) + a_3 + f(a_4) + ... ≡ c (mod 10). This code was a big
stride forward, but it did not detect the transpositions completely,
as the transposition of 0 and 9 goes by unnoticed, giving a detection
rate of 97.8%. Because of its alternating character none of the jump
transpositions is detected. As will be shown in chapter 3, it is not
accidental that 1 out of the 45 possible transpositions remains
undetected by codes using fixed permutations in combination with the
addition modulo 10.
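
The single undetected transposition can be found by direct enumeration. The sketch below assumes only the definition of f given above; the names `table` and `undetected` are illustrative.

```python
from itertools import combinations

def f(x):
    """I.B.M. permutation: double the digit and add the carry back in."""
    return 2 * x if 2 * x < 10 else 2 * x - 9

# Bottom row of the permutation, f(0) ... f(9).
table = [f(x) for x in range(10)]

# On a pair of adjacent positions (one plain, one passed through f) the
# transposition ab -> ba goes unnoticed when a + f(b) = b + f(a) (mod 10).
undetected = [
    (a, b)
    for a, b in combinations(range(10), 2)
    if (a + f(b)) % 10 == (b + f(a)) % 10
]

rate = 1 - len(undetected) / 45
print(table)
print(undetected, f"{rate:.1%}")
```

Only one of the 45 unordered pairs satisfies the condition, which is where the figure of 97.8% comes from.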

It is however possible to do better with respect to the jump
transpositions, by using sequences of (generalized) weights with a higher
period than 2. It is well-known from number theory that the powers of
an arbitrary number modulo n form a periodic sequence. The number
9 (≡ -1 (mod 10)) has for instance the period 2 modulo 10, as 9^2 = 81
and 81 ≡ 1 (mod 10). The period of 3 modulo 10 is 4, as can be easily
verified. The code defined by Σ 3^i a_i ≡ c (mod 10), in which the weights
run cyclically through 1, 3, 9, 7, besides being E_1-proof, detects 8 out
of the 9 transpositions, since
3^i a_i + 3^{i+1} a_{i+1} ≡ 3^i a_{i+1} + 3^{i+1} a_i (mod 10) if 2a_i ≡ 2a_{i+1} (mod 10).
The jump transpositions are also detected for 88.9%, since
3^i a_i + 3^{i+2} a_{i+2} ≡ 3^i a_{i+2} + 3^{i+2} a_i (mod 10) leads to 8a_i ≡ 8a_{i+2}
(mod 10). Also the twin errors are detected for 88.9%, as
3^i a + 3^{i+1} a ≡ 3^i b + 3^{i+1} b (mod 10) is equivalent with 4a ≡ 4b (mod 10).
But unfortunately now the jump twin errors give trouble, as
3^i a + 3^{i+2} a = 3^i (1+9) a ≡ 0 (mod 10) for all a.

It is of course not necessary that the weights form a geometric progression
modulo 10 and one sometimes sees a weighted code, defined by
a_1 + 3a_2 + 7a_3 + a_4 + 3a_5 + ... ≡ c (mod 10). This code, which is of
period 3, is equally good on the single errors and the transpositions,
but does better on the jump twin errors than the former one did. Of
the jump twin errors it detects 88.9% on 2 out of the 3 positions and
0% on the third, giving a net result of 59.3%. The drawback is that
the same rate now holds for the twin errors, instead of the 88.9%.
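
The rates for the two weighted codes can be recomputed by averaging over one period of the weight sequence. This is a sketch under the assumption that all error pairs are equally likely; the function name `detection_rate` and its parameters are illustrative.

```python
def detection_rate(weights, gap, twin):
    """Average detection rate over one period of a weighted check modulo 10.

    weights: one period of the weight sequence, e.g. (1, 3, 9, 7).
    gap:     1 for adjacent positions, 2 for "jump" errors.
    twin:    False for transpositions ab -> ba, True for twin errors aa -> bb.
    """
    period = len(weights)
    detected = total = 0
    for i in range(period):
        w1, w2 = weights[i], weights[(i + gap) % period]
        for a in range(10):
            for b in range(10):
                if a == b:
                    continue
                if twin:
                    before, after = w1 * a + w2 * a, w1 * b + w2 * b
                else:
                    before, after = w1 * a + w2 * b, w1 * b + w2 * a
                total += 1
                detected += (before - after) % 10 != 0
    return detected / total

for name, weights in [("powers of 3", (1, 3, 9, 7)), ("period 3", (1, 3, 7))]:
    rates = {
        "transpositions": detection_rate(weights, 1, False),
        "twin errors": detection_rate(weights, 1, True),
        "jump transpositions": detection_rate(weights, 2, False),
        "jump twin errors": detection_rate(weights, 2, True),
    }
    print(name, {k: f"{v:.1%}" for k, v in rates.items()})
```

For the period-3 weights the twin and jump twin rates both come out as 160 detected cases out of 270, i.e. 59.3%.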

Since 1, 3, 7, 9 are the only proper weights which are admissible in
view of the complete detection of the single errors and since virtually
all possible combinations have been tried, it is reasonable to turn the
attention to codes with generalized weights. An obvious improvement
of the I.B.M. code is to make its period higher by using powers of the
permutation f for the successive weights. This generalization gives a
code defined by: a_1 + f(a_2) + f^2(a_3) + f^3(a_4) + ... ≡ c (mod 10).
The code, which is obviously E_1-proof, still has a detection rate of
97.8% for the transpositions. The detection of the jump transpositions
depends on how often f^i(a) + f^{i+2}(b) ≡ f^i(b) + f^{i+2}(a) (mod 10) holds
for a ≠ b. Put A = f^i(a) and B = f^i(b), then the condition becomes:
A + f^2(B) ≡ B + f^2(A) (mod 10) or A - f^2(A) ≡ B - f^2(B). The function
X - f^2(X) has the values 0-0, 1-4, 2-8, 3-3, 4-7, 5-2, 6-6, 7-1, 8-5
and 9-9 or modulo 10: 0, 7, 4, 0, 7, 3, 0, 6, 3, 0, so that 8 out of
the 45 combinations a, b fulfill the equation and hence 82.2% of the
jump transpositions will be detected. In a similar way the detection
rate of the twin errors is found to be 93.3% and for the jump twin
errors 95.6%. The phonetic errors have a detection rate of 89.6%.
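
The counting argument above (two digits collide when a function such as X - f^2(X) takes the same value on both) can be sketched as follows. The helper names `compose` and `rate` are illustrative; the phonetic rate is not covered by this sketch.

```python
from collections import Counter

F = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]  # the I.B.M. permutation f as a lookup table

def compose(p, q):
    """Return the permutation x -> p(q(x))."""
    return [p[q[x]] for x in range(10)]

def rate(values):
    """Detection rate when two digits collide iff they give the same value."""
    colliding_pairs = sum(n * (n - 1) // 2 for n in Counter(values).values())
    return 1 - colliding_pairs / 45

F2 = compose(F, F)  # f applied twice

rates = {
    "transpositions":      rate([(x - F[x]) % 10 for x in range(10)]),
    "twin errors":         rate([(x + F[x]) % 10 for x in range(10)]),
    "jump transpositions": rate([(x - F2[x]) % 10 for x in range(10)]),
    "jump twin errors":    rate([(x + F2[x]) % 10 for x in range(10)]),
}
for error_type, r in rates.items():
    print(f"{error_type}: {r:.1%}")
```

The four printed rates correspond to 1, 3, 8 and 2 colliding pairs out of 45 respectively.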


In chapter 3 it will be shown that the permutation g defined by
g(x) = f(x) + 6, or

    g = ( 0 1 2 3 4 5 6 7 8 9 )
        ( 6 8 0 2 4 7 9 1 3 5 ),

has the same rate of detection
for the single errors, the transpositions and the twin errors as f
has. On the other error types the code defined by: Σ g^i(a_i) ≡ c
(mod 10) is better than the one defined with f. It detects the jump
transpositions and the jump twin errors both for 95.6% and the
phonetic errors for 90.3%.

An oculist from the Leiden University,
Dr. A.D. Colenbrander, who needed an error detecting code for a
hospital administration, was not satisfied with the codes known to
him and designed an interesting and remarkably good one as follows.
The 10 non-zero residue classes modulo 11 form a group under
multiplication. In particular multiplication by 2 modulo 11 gives a permutation
of these 10 classes. Coding the class 10 by 0 and the classes
1, 2, ..., 9 by 1, 2, ..., 9 thus gives a permutation of the decimals

    f = ( 0 1 2 3 4 5 6 7 8 9 )
        ( 9 2 4 6 8 0 1 3 5 7 ).

The code is defined by Σ f^i(g(a_i)) ≡ 0 (mod 10),
where g is an arbitrary permutation. In particular g can be chosen
so that the code becomes Σ f^i(i + a_i) ≡ 0 (mod 10), where i + a_i is
to be taken modulo 10 too. The latter code detects 100% of the single
errors, 97.8% of the transpositions, 93.3% of the twin errors, 95.6%
of the jump transpositions and jump twin errors and 100% of the
phonetic errors. His way of making a permutation resembling
multiplication by 2 is apparently more fortunate than the one of the I.B.M.
code. His code is a close analogue of the best modulo 11 code, defined
by Σ 2^i a_i ≡ c (mod 11). It is also meritorious that he uses the
permutation f in a "geometric" progression.

It is rather unsatisfactory
to try haphazardly some permutations and moreover it is by no means
necessary to limit the generalized weights to the powers of one single
permutation. In chapter 3 an exhaustive search for the most favourable
combination of permutations has therefore been carried through. It
turned out that theoretically the code defined by: Σ f_i(a_i) ≡ c (mod 10),
where the f_i are given in section 3.5, is one of the best. This check
is shown to detect 97.8% of the transpositions and of the twin errors,
and 95.6% of the jump transpositions and the jump twin errors, whereas
the phonetic errors are detected for 97.9%. So far the story of the
codes modulo 10.
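
Colenbrander's permutation can be rebuilt from its description (multiplication by 2 modulo 11, with the class 10 written as the digit 0). The sketch below does so and counts the colliding digit pairs for the plain form Σ f^i(a_i), i.e. with g taken as the identity; that choice is an assumption of this sketch, and the position-dependent variant with i + a_i is not analysed here. All function names are illustrative.

```python
from collections import Counter

def colenbrander_permutation():
    """Multiplication by 2 modulo 11 on the classes 1..10, class 10 written as 0."""
    perm = []
    for digit in range(10):
        residue = 10 if digit == 0 else digit      # digit 0 stands for class 10
        image = (2 * residue) % 11                 # never 0, since 11 is prime
        perm.append(0 if image == 10 else image)   # write class 10 as digit 0 again
    return perm

def colliding_pairs(values):
    """Number of unordered digit pairs {a, b} that give the same value."""
    return sum(n * (n - 1) // 2 for n in Counter(values).values())

f = colenbrander_permutation()
f2 = [f[f[x]] for x in range(10)]  # f applied twice

missed = {
    "transpositions":      colliding_pairs([(x - f[x]) % 10 for x in range(10)]),
    "twin errors":         colliding_pairs([(x + f[x]) % 10 for x in range(10)]),
    "jump transpositions": colliding_pairs([(x - f2[x]) % 10 for x in range(10)]),
    "jump twin errors":    colliding_pairs([(x + f2[x]) % 10 for x in range(10)]),
}
print(f)
for error_type, n in missed.items():
    print(f"{error_type}: {n} of 45 missed, {1 - n / 45:.1%} detected")
```

Under these assumptions the counts of 1, 3, 2 and 2 missed pairs reproduce the quoted 97.8%, 93.3% and 95.6%.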


2.3.1 Biqu 





The first pure decimal code which is both E_1- and transposition-proof
is perhaps less powerful than the best codes described in the previous
section, but it is interesting for other reasons. Its weakness lies in
the rather poor detection rate for the twin errors and the jump
transpositions, namely 55.5% and 66.7% respectively. The code is described
at length in chapter 5 and it will suffice here to mention that the
phonetic errors are detected for 100% and the jump twin errors for only
66.7%. The version by Benard, which is also described in chapter 5, has
the same properties except for the twin error detection, which is only
27.8%. The generalization, which is of a later date (see 5.3), also
scores 100% for the single errors, the transpositions and the phonetic
errors. A detection rate of 88.9% holds for the twin errors and the
jump transpositions, whereas the jump twin errors are detected for
66.7%. One of the merits of these biquinary codes is that they lend
themselves to a relatively simple technical implementation.


2.3.2 The dihedral codes 


In chapter 4 codes of a quite different nature are described. Instead of
addition modulo 10 the multiplication in the dihedral group D_5, of the
order 10, is employed. This group is non-abelian, since a×b = b×a does
not always hold true. It follows therefore that the straight product
code defined by: a_1 × a_2 × a_3 × ... × a_n = c in D_5 does not miss all
transpositions. In fact 2 out of the 3 transpositions are detected. The same
fraction of the twin errors, the jump transpositions and the jump twin
errors is detected. The alternating product check, defined by:
a_1 × a_2^{-1} × a_3 × a_4^{-1} × ... = c in D_5 is, in this group, no improvement,
since now 5 out of the 9 transpositions and all the twin errors escape
detection. Checks using an analogue of the weights are also far from
satisfactory, but there are many combinations of generalized weights
which do yield excellent results. It is shown in chapter 4 that 100%
detection of the transpositions can be achieved in combination with
95.6% of detection for the twin errors and 94.2% for the jump
transpositions and twin errors. There exists a progressive code of the form:
f(a_1) × f^2(a_2) × f^3(a_3) × ... = c in D_5, which has the qualities mentioned
above and which scores 95.3% in the phonetic errors, with

    f = ( 0 1 2 3 4 5 6 7 8 9 )
        ( 1 5 7 6 2 8 3 0 9 4 ).

There also exist many non-progressive codes of the
form f_1(a_1) × f_2(a_2) × f_3(a_3) × ... = c in D_5, which are even phonetic
error-proof. In 4.5 it is described how the permutations f_i can be
constructed.
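
The figure of 2 out of 3 for the straight product check can be recounted with an explicit multiplication in D_5. The encoding of the ten group elements as digits used below (0 to 4 for rotations, 5 to 9 for reflections) is an assumption of this sketch and need not coincide with the numbering used in chapter 4; the counts do not depend on it.

```python
def d5_mul(a, b):
    """Product a x b in the dihedral group D_5.

    Assumed encoding: digits 0..4 are the rotations r^k, digits 5..9 are the
    reflections r^k s, multiplied with the rule s r = r^-1 s.
    """
    ka, kb = a % 5, b % 5
    a_reflects, b_reflects = a >= 5, b >= 5
    k = (ka - kb) % 5 if a_reflects else (ka + kb) % 5
    return k + (5 if a_reflects != b_reflects else 0)

digits = range(10)
pairs = [(a, b) for a in digits for b in digits if a != b]  # 90 ordered pairs

# Transposition ab -> ba is detected when a x b differs from b x a.
transpositions = sum(d5_mul(a, b) != d5_mul(b, a) for a, b in pairs)
# Twin error aa -> bb is detected when a x a differs from b x b.
twins = sum(d5_mul(a, a) != d5_mul(b, b) for a, b in pairs)

print(transpositions, twins, len(pairs))
```

The pairs that escape are the pairs of rotations and the pairs formed by the identity and a reflection (for transpositions), and the pairs among the six elements whose square is the identity (for twin errors).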

Comparing the codes of chapter 3 and chapter 4 is not quite as simple
as the analysis given above suggests. A code missing 1 out of the 45
transpositions does not necessarily miss 1/45-th of the transpositions,
as the assumption that the transpositions are uniformly distributed is
very unlikely to hold. Much more about this distribution should be known in order
to be able to construct better codes, which capitalize upon this fact.


2.3.3 Codes modulo k, 





The best known higher modulus codes are the ones modulo 11. The
disadvantages of the modulo 11 codes in general have been discussed in
section 2.1. Here only the detecting qualities will be subjected to
analysis. Not to be recommended is the straight modulo 11 check, defined
by Σ a_i ≡ c (mod 11), because it misses all transpositions. The alternating
checks modulo 11, like Σ (-1)^i a_i ≡ c (mod 11) and a_1 + 2a_2 + a_3 + 2a_4 + ...
≡ c (mod 11), though better, cannot be recommended either because they still
miss all jump transpositions. There are however scores of possibilities
for good weighted codes modulo 11. All the non-zero weights are admissible
with respect to the detection of the single errors. Transposing the
digits of the i-th and the j-th position is detected, provided that
w_i ≢ w_j (mod 11), where w_k is the weight of the digit on the k-th
position. The twin errors on the same positions are detected if
w_i + w_j ≢ 0 (mod 11). The main interest is of course to find sequences
of weights fulfilling the 2 requirements for j = i + 1 and j = i + 2
at least. The arithmetic progression modulo 11: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 1, 2, etc. (the 0 has to be skipped) is often applied. A flaw
of this choice is that the 5 and the 6 are on adjacent positions, so
that not all twin errors are detected. Another slight disadvantage
is that not all phonetic errors are detected, since ix ≡ i + (i+1)x
(mod 11) holds for x ≡ -i (mod 11) and hence only for i = 1 all
phonetic errors are detected and otherwise 7 out of the 8. The geometric
progression modulo 11 of the powers of 2 is better. Not only is this
code twin error-proof, but it does as by miracle detect all phonetic
errors, since 2^i x + 2^{i+1}·0 ≡ 2^i·1 + 2^{i+1} x (mod 11) holds only if
x ≡ 2x + 1 (mod 11) or x ≡ 10 (mod 11), which is impossible since
9 ≥ x ≥ 0.
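
The phonetic condition for a geometric progression of weights reduces to x ≡ 1 + r·x (mod 11), where r is the ratio, because the common factor can be divided out modulo the prime 11. A small sketch, with the illustrative function name `undetected_phonetic`, lists the digits x (2 to 9) for which the error x0 → 1x would escape:

```python
def undetected_phonetic(ratio, modulus=11):
    """Digits x for which the phonetic error x0 -> 1x escapes a check whose
    weights form a geometric progression with the given ratio.

    With weights w and ratio*w on two adjacent positions the error is missed
    when w*x + ratio*w*0 = w*1 + ratio*w*x, i.e. when x = 1 + ratio*x.
    """
    return [x for x in range(2, 10) if (x - 1 - ratio * x) % modulus == 0]

print(undetected_phonetic(2))  # powers of 2
print(undetected_phonetic(3))  # powers of 3, for comparison
```

For the ratio 2 the list is empty, in agreement with the argument above.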

Not so lucky is the progressive code with w = 3, since 3^i x ≡ 3^i + 3^{i+1} x
(mod 11) holds if x = 5. Beckley (see 2) denounces both the
arithmetic and the geometric progression and he recommends a
progression-free set of weights. His argument is that the progressive weights are
vulnerable to the type of error like 2560004→2056004→2005604→2000564,
called shift errors. For each choice of progression there are certain
combinations which can be shifted freely, that is, these combinations
are such that none of the shifts is detected. For the Beckley choice
of weights there are combinations of digits which are not detected in
case of a single shift, but which are in fact detected in case a double
shift occurs. On the other hand there are also pairs of digits which
are detected in case of a single shift and not detected for the double
shift. It is questionable whether it has an influence on the average
number of undetected errors, unless one presupposes a higher frequency
of use of the vulnerable combinations. The weights given by him do not
quite meet the specifications, since the combinations 13; 26; 39; 41;
54; 67; 82; 95 are all immune to both the single and the double shift
on the positions 9 and 8, as can be verified easily. His weights are
9, 10, 7, 8, 4, 6, 3, 5, 2, 1 and 10×3 + 7×9 ≡ 7×3 + 8×9 ≡ 8×3 + 4×9
(mod 11).
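
The immune combinations can be checked mechanically. The sketch below assumes that the shifted pair moves through positions otherwise filled with zeros, so that only the pair itself contributes to the weighted sum; the names `windows`, `contribution` and `immune` are illustrative.

```python
MOD = 11
# Three consecutive pairs of adjacent Beckley weights: (10, 7), (7, 8), (8, 4).
# A digit pair xy sitting on the first of these can be shifted one or two
# places to the right; all other digits involved are assumed to be 0.
windows = [(10, 7), (7, 8), (8, 4)]

def contribution(x, y, window):
    """Contribution of the adjacent digits x, y to the weighted sum modulo 11."""
    return (window[0] * x + window[1] * y) % MOD

# A pair is immune when its contribution is the same in all three windows.
immune = [
    f"{x}{y}"
    for x in range(1, 10)
    for y in range(1, 10)
    if len({contribution(x, y, w) for w in windows}) == 1
]
print(immune)
```

The condition amounts to y ≡ 3x (mod 11), which has a solution among the digits 1 to 9 for every x except 7.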

Codes using a modulus higher than 11 are possible in cases where a
higher redundancy is admissible. If one has to protect a 7 digit code,
of which only 5·10^6 words are needed, then it is perhaps advisable to
employ an 8 digit code satisfying a check equation modulo 19. By doing
so it becomes possible to detect random errors for about 95% automatically
at the input, instead of during the processing.


2.3.4 Codes which are not E_1-proof


There may be cases where the application of a modulus below 10 is
attractive, even though these codes cannot be E_1-proof. The case of a
check equation modulo 2 is of interest, since it gives rise to a code
detecting all restricted single errors (i.e. single errors of 1 unit,
like 6→7 or 4→3). The code may be useful for small sets, say less
than 500 items, occurring on questionnaires. In general it is a good
policy to use the natural redundancy for error detection. If one has to
code 1400 items one would need 4 decimal digits anyhow. By using codes
with check equation modulo 7, one gets a certain protection without
extending the length of the code words. As soon as one adds a check
digit, it is of course inefficient to use a check modulo 7. Only one
case came to the attention of the present author. The code in question
is defined by all the words with the property that the decimal value
is divisible by 7. Hence Σ 10^{n-i} a_i ≡ 0 (mod 7) for all code words a_1 a_2 ... a_n,
and since 10 ≡ 3 (mod 7) the weights w_i satisfy w_i ≡ 3^{n-i} (mod 7).
Single errors are not detected if a_i ≡ a_i' (mod 7), that is if a_i
equals 0, 1 or 2 and a_i' equals 7, 8 or 9 respectively. Assuming a
uniform distribution this gives a rate of undetected single errors of 1/15. The codes
modulo 9 will yield (under the same assumption) a much better rate,
namely 1/45, since only the error from 0 to 9 will remain unnoticed.
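
Both fractions follow from listing the digit pairs that are congruent modulo the modulus. The sketch assumes that the weight on the position in question is invertible modulo the modulus, as is the case for the powers of 3 modulo 7; the function name is illustrative.

```python
from itertools import combinations

def missed_single_errors(modulus):
    """Unordered digit pairs {a, b} that a check modulo `modulus` cannot tell
    apart, assuming the weight on that position is invertible."""
    return [(a, b) for a, b in combinations(range(10), 2) if (a - b) % modulus == 0]

for m in (7, 9):
    missed = missed_single_errors(m)
    print(m, missed, f"{len(missed)}/45")
```

Three pairs out of 45 give 1/15 for the modulus 7, and one pair out of 45 gives 1/45 for the modulus 9.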


According to the error samples mentioned in the introduction this does
not hold true, since the combined frequencies of 0→7, 1→8 and 2→9 are
much lower than the frequency of the transcription error 0→9. It goes
without saying that the frequencies of the converse errors have also
to be taken into account. Only for completeness' sake some of the codes
modulo 9 will be included in the comparative code charts. Up to now
no codes using another group than the cyclic one seem to have been
applied. It is doubtful whether much improvement can be achieved in
that way. If reliable frequency tables were known, then it would
certainly be possible to improve the single error detection rate
considerably. This could be done by recoding the digits so that 0 and
9 are represented by 2 symbols with a lower transition frequency.

Also for completeness' sake and perhaps as a warning some non-E_1-proof
codes will be discussed.

First of all one sometimes sees codes modulo 10 which use degenerate
weights like 2, 4, 6, 8 or even 5. The even weights do not detect errors
like a→a+5, so that 1 out of the 9 single errors on that position
escapes. Positions on which the weight 5 is used admit single errors
for which the parity is unchanged, so that 4 out of the 9 single
errors slip through. Most notorious are the codes defined by
Σ i a_i ≡ c (mod 10), Σ 2^i a_i ≡ c (mod 10) and the codes with the weights
121213 or 1234678. Still another type of codes which cannot be
recommended are the double modulus alternating codes (see 11), called Bull codes
for short. These codes were originally introduced as E_1- and
transposition-proof codes. They are defined as follows: let p and q be 2 integers
satisfying 11 ≥ p+q > p, q > 2 and let s_p ≡ Σ (-1)^i a_i (mod p) and
s_q ≡ Σ (-1)^i a_i (mod q), with p > s_p ≥ 0 and q > s_q ≥ 0. From the
assumptions it follows that 9 ≥ p-1+q-1 ≥ s_p + s_q ≥ 0 holds, so that
s_p + s_q can be used as a check digit. The underlying idea probably
was that as soon as p or q was odd, the transpositions would be detected
by one of the two equations. The combinations 3,5; 4,5; 3,7; 4,7 and 5,6
are recommended (see 11) and also 3,8 can be tried. A further analysis
shows however that the claims are not justified. In another patent, codes
are proposed in which not the sum, but a number formed from s_p and s_q is used as
check symbol, which is of course no longer decimal. The application of
a check symbol with that many values opens the possibilities for codes
far superior, so that this variant cannot be recommended either.
Finally the codes modulo 11 with a decimal check symbol with the 0 in
the double role of 10 and 0 are included primarily as a warning, since
this twist destroys all the good qualities which the modulo 11 codes
may have.


& 


Several codes have been tried on the sample of 12112 errors in 6 digit
words. This sample is too small to give significant results for the
better codes. This is especially so since there are a few pairs which
occur with a multiplicity of about 20 and even one pair (903559→145379)
with a multiplicity of 89. A second test has therefore been performed on
the non-single errors after removal of all duplicates. This smaller
sample consists of 1665 double errors and 471 multiple errors. Only
E_1-proof codes are tested on these 2136 errors. The numbers of undetected
errors per check system are listed in the second table below. In the same
table the mathematical expectations per 1000 errors are given for the
various error types like the transpositions, the twin errors, the jump
transpositions and twin errors, the phonetic errors and finally
the random errors. These expectations are based on the assumption that
the various possible transpositions etc. are equally likely, which is
certainly not the case in the present sample. The last two columns of the
same table give the percentage of the undetected errors with respect to
the non-single errors and with respect to all errors as calculated from
the mathematical expectations. The check systems are listed in descending
order of the number of undetected errors from the sample.

Table of test results on the 12112 pairs of 6-digit words.


Check system                              Number of not detected
                                          single errors | double errors | multiple errors | total

Σ a_i ≡ c (mod 9)                         | 84
Σ a_i ≡ c (mod 11) with 10=0              | 88
Σ i a_i ≡ c (mod 10)
Σ a_i ≡ c (mod 10)
Σ a_i ≡ c (mod 11)
r dta =c( mod 10)
Bull type 4,5
Bull type 4,7
Weighted 121212 modulo 10
Weighted 121212 modulo 9
Bull type 3,7
eed ke a, Eg 9)
Bull type 3,
za” a, =e(mod 5
Alternating dihedral code
Weighted 212121 modulo 10
Bull type 5,6
Σ (-1)^i a_i ≡ c (mod 11) with 10=0
Straight dihedral code
Weighted 121212 mod. 11; 10=0
Bull type 3,8
Lg) ta. slm 10}
Weighted 313131 modulo 10
Weighted 137137 modulo 10
Σ 3^i a_i ≡ c (mod 7)
Σ 3^i a_i ≡ c (mod 10)
Σ 3^i a_i ≡ c (mod 11), with 10=0
Σ 2^i a_i ≡ c (mod 11), with 10=0
Σ i a_i ≡ c (mod 11), with 10=0
Σ (-1)^i a_i ≡ c (mod 11)
Weighted dihedral code
Biquinary code, Benard version
Progressive dihedral code
IBM code
Weighted 212121 modulo 11
Weighted 313131 modulo 11
First biquinary code
Generalized IBM code
Generalized biquinary code
Σ f_i(a_i) ≡ c (mod 10)
Modified generalized IBM code
Σ f^i(a_i) ≡ c (mod 10) (Colenbrander)
Best dihedral code                        0 | 56
Σ 3^i a_i ≡ c (mod 11)                    44
Σ i a_i ≡ c (mod 11)
Weighted 463521 mod. 11 (Beckley)
Σ 2^i a_i ≡ c (mod 11)


Test on sample of 2136 non-single errors: measured frequencies and
theoretical estimates of undetected errors.
The general impression is that the tests give a good confirmation of the
theory. There are a few discrepancies which show that the assumption of
the uniform error distribution is invalid. One would for instance expect
that the check equation Σ 2^i a_i ≡ c (mod 9) would be superior to
Σ 3^i a_i ≡ c (mod 7). The first check does not detect the single errors
0→9 and 9→0, whereas the second one does not detect 0→7, 1→8, 2→9 and
7→0, 8→1, 9→2, but the latter six together have a much lower frequency
than the first two. The check equation Σ 2^i f(a_i) ≡ c (mod 9), where f
is a permutation such that f(0) = 0 and f(7) = 9, would yield a much
better result on the single errors (17 instead of 621) than Σ 2^i a_i ≡ c
(mod 9). It is however dangerous to build a code on the assumption
that the transcription errors 0→7 and 7→0 are per se rare. The danger
may be illustrated by the typical high frequency of the transcription
errors 7→9 and 9→7, which probably arises from the phonetic resemblance
of "zeven" and "negen", which is Dutch for 7 and 9. The obvious conclusion
is that only the E_1-proof codes are of practical value.


Though the evidence is not sufficient for the ultimate choice of the
"best" code, it is clear that only the lower half of the second table
contains the serious candidates. It is also clear that the modulo 11
codes are by no means the only answer to the detection problem. If there
are reasons for avoiding the modulo 11 codes, then there are certainly
competitive pure decimal codes available to the system designer. This
is especially so since the modulo 11 codes look better because they
profit from the hidden redundancy which they require (see section 2.1).

Let D be a set with 10 elements and let + be a binary operation
defined on that set such that (D,+) is a group. Consider all ordered
n-tuples of elements from D, in other words the set D^n. Let f_i, for
1 ≤ i ≤ n, be n functions with value and argument both in D. Hence for
x ∈ D also f_i(x) ∈ D, which is denoted by f_i ∈ D^D. Let furthermore c be
an arbitrarily chosen element of D and let a code C be defined as the
subset of D^n consisting of the n-tuples a_1 a_2 ... a_n which satisfy:

    Σ_{i=1..n} f_i(a_i) = c,

hence C = { a_1 a_2 ... a_n | Σ_{i=1..n} f_i(a_i) = c }. The purpose of this
chapter is to find the functions f_i which yield the "best" codes.


A function which maps D onto D is called a permutation. Since D is
finite it is equivalent to define the permutations as one to one
functions, or as reversible functions. The set of all permutations
of D is denoted by S, hence S ⊂ D^D. More formally S is defined by:
S = { f | f ∈ D^D, { f(x) | x ∈ D } = D }.

If f, g ∈ S then the function h defined by h(x) = g(f(x)) also belongs to S.
The permutation h is called the product of the permutations f and g
and is denoted by gf. The set S is a group with respect to that product.
It is called the symmetric group. The identity element will be denoted
by e, hence e(x) = x for all x ∈ D. The group is not abelian as can be
seen from the example:



The inverse of permutation f is denoted by f^{-1} and hence f f^{-1} = f^{-1} f = e.


Theorem 3.0 If f_i ∈ S for all i, then the code C is E_1-proof.

Proof: Two words differing only on the j-th position cannot both belong
to C, since otherwise

    Σ_{i=1..j-1} f_i(a_i) + f_j(a_j) + Σ_{i=j+1..n} f_i(a_i)
        = Σ_{i=1..j-1} f_i(a_i) + f_j(a_j') + Σ_{i=j+1..n} f_i(a_i)

would hold. By cancelling equal terms from the left and the right,
f_j(a_j) = f_j(a_j') would follow, so that from f_j ∈ S the contradiction
a_j = a_j' can be derived.


The converse of this theorem is not true since a code C may be
E_1-proof while not all the functions f_i belong to S. Counter example:
Let the functions f_1, f_2 and f_3 be defined by the table:



The words a_1 a_2 a_3 satisfying f_1(a_1) + f_2(a_2) + f_3(a_3) = 0 obviously cannot
contain any digit "higher" than 3 and therefore the code is E_1-proof
since from f_i(a_i) = f_i(a_i') it follows again that a_i = a_i', for a_i, a_i' ≤ 3.
x ee * i ái AN see 


In the rest of this chapter it will be assumed that all f_i are
permutations. It will also be assumed in this chapter that the group
(D,+) is abelian. It is well-known from the theory of groups that this
group has to be the cyclic group of order 10. It is also well-known
that the additive group modulo 10 is cyclic. Those readers not
familiar with group theory may therefore interpret the operation + as
addition modulo 10. In agreement with this interpretation the
elements of D will be denoted by the decimals, or D = {0,1,2,3,4,5,6,7,8,9}.


3.1. Formulation of the requirements.


The condition that a code C as defined above is transposition-proof is
that f_i(a_i) + f_{i+1}(a_{i+1}) ≠ f_i(a_{i+1}) + f_{i+1}(a_i) for all a_i, a_{i+1}
with a_i ≠ a_{i+1}. Now put x = f_i(a_i) and y = f_i(a_{i+1}), then
a_i = f_i^{-1}(x) and a_{i+1} = f_i^{-1}(y), and the condition becomes
x + f_{i+1} f_i^{-1}(y) ≠ y + f_{i+1} f_i^{-1}(x), or since the group is abelian,
x - f_{i+1} f_i^{-1}(x) ≠ y - f_{i+1} f_i^{-1}(y) follows. The condition
has to hold for all x, y ∈ D with x ≠ y. In other words the
function g defined by g(x) = x - f_{i+1} f_i^{-1}(x) has to be a permutation.
Loosely said x - f_{i+1} f_i^{-1}(x) has to be a permutation or more formally
{ x - f_{i+1} f_i^{-1}(x) | x ∈ D } = D. In an analogous way it can be deduced that the
twin errors are detected if x + f_{i+1} f_i^{-1}(x) is a permutation.

Jump transposition detection requires that
f_i(a_i) + f_{i+1}(a_{i+1}) + f_{i+2}(a_{i+2}) ≠ f_i(a_{i+2}) + f_{i+1}(a_{i+1}) + f_{i+2}(a_i).
Setting x = f_i(a_i) and y = f_i(a_{i+2}) and cancelling the middle terms gives
x + f_{i+2} f_i^{-1}(y) ≠ y + f_{i+2} f_i^{-1}(x), so that now x - f_{i+2} f_i^{-1}(x)
has to be a permutation. Again in an analogous way the condition for the
detection of the jump twin errors can be reduced to the requirement that
x + f_{i+2} f_i^{-1}(x) has to be a permutation.

Finally the phonetic errors are detected if
f_i(a) + f_{i+1}(0) ≠ f_i(1) + f_{i+1}(a) holds for a ≠ 0, 1. This inequality is
however also valid for 0 and 1, so that, after setting x = f_i(a), it follows
that x - f_{i+1} f_i^{-1}(x) ≠ f_i(1) - f_{i+1}(0) has to be true for all x ∈ D.


In the list below the conditions are summarized.


Transpositions        { x - f_{i+1} f_i^{-1}(x) | x ∈ D } = D
Twin errors           { x + f_{i+1} f_i^{-1}(x) | x ∈ D } = D
Jump transpositions   { x - f_{i+2} f_i^{-1}(x) | x ∈ D } = D
Jump twin errors      { x + f_{i+2} f_i^{-1}(x) | x ∈ D } = D
Phonetic errors       f_i(1) - f_{i+1}(0) ∉ { x - f_{i+1} f_i^{-1}(x) | x ∈ D }





The five requirements are not compatible since the first one contradicts
the last one. Suppose that { x - f_{i+1} f_i^{-1}(x) | x ∈ D } = D, then it follows,
as f_i(1) - f_{i+1}(0) ∈ D, that the last condition is not fulfilled.

Fortunately or rather unfortunately none of the first four conditions
can be satisfied because of the next theorem, so that it becomes
theoretically possible to satisfy the fifth one.

Theorem 3.2.0 There does not exist a permutation f of the 2k residue
classes 0,1,...,2k-1 such that the function g defined by g(x)=f(x)±x
is also a permutation.

Proof: Consider Σ(f(x)+x) = Σf(x) + Σx = 2Σx = 2k(2k-1) ≡ 0 (mod 2k),
but if f(x)+x were a permutation then Σ(f(x)+x) = Σx = k(2k-1) ≡ k (mod 2k)
would hold. For f(x)-x the sum is Σf(x) - Σx = 0, which leads to the same
contradiction.

As a corollary it follows that, if in {f(x)±x | x=0,...,2k-1} just one
element is missing, say i, then i+k (mod 2k) has to appear twice.

By applying this theorem to the permutations f_{i+1}f_i^{-1} and
f_{i+2}f_i^{-1} with k=5 it follows that the first four conditions are
impossible. This is the result referred to in 2.1, which led to the belief
that no pure decimal check could yield a single-error- and
transposition-proof code.
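Theorem 3.2.0 can be checked by exhaustion for small moduli. The sketch below is only an illustration (the function name `has_complete_mapping` is an assumption of this example, not notation from the text): it tries every permutation f of the residues modulo n and reports whether f(x)+x or f(x)-x can also be a permutation. An odd modulus is included for contrast, since the parity argument of the proof does not apply there.

```python
from itertools import permutations

def has_complete_mapping(n, sign):
    """True if some permutation f of Z_n makes f(x) + sign*x a permutation too."""
    for f in permutations(range(n)):
        values = {(f[x] + sign * x) % n for x in range(n)}
        if len(values) == n:
            return True
    return False

# Even moduli 2k: the theorem says neither sign can work.
for n in (2, 4, 6, 8):
    print(n, has_complete_mapping(n, +1), has_complete_mapping(n, -1))

# Odd modulus for contrast: here such permutations do exist.
print(5, has_complete_mapping(5, +1), has_complete_mapping(5, -1))
```

The case of interest, n = 10, is out of reach of this naive loop in reasonable time only because of the 10! candidates; the theorem settles it without any search.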


The question remains of how far the ideal can be approximated by using
addition modulo 10. The next best to being a permutation is that the set
{x-f(x) | x∈D} lacks only one element. The expression "f(x)-x is nearly
a permutation" will be used in the sequel if such is the case.


The I.B.M. code of 2.3 is an example:





Note that the digit 5 is missing and that the 0 occurs twice.
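The remark about the missing 5 and the doubled 0 can be reproduced with a few lines. The sketch assumes that the I.B.M. permutation is the familiar one that replaces a digit x by the digit sum of 2x; that assumption is mine and is not taken from the table of the text.

```python
from collections import Counter

# Assumed I.B.M. permutation: x -> digit sum of 2x.
f = [(2 * x) // 10 + (2 * x) % 10 for x in range(10)]

differences = [(x - f[x]) % 10 for x in range(10)]
counts = Counter(differences)
missing = [d for d in range(10) if counts[d] == 0]
doubled = [d for d in range(10) if counts[d] == 2]

print("f(x)   :", f)
print("x-f(x) :", differences)
print("missing:", missing, " occurring twice:", doubled)
```

The two arguments with the same difference are 0 and 9, which corresponds to the undetected transposition 09 ↔ 90.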

The set of permutations f such that x-f(x) is nearly a permutation is
denoted by P. The subset of P consisting of those permutations f for
which x+f(x) is also nearly a permutation will be denoted by Q, hence
Q⊂P⊂S.


If g_i∈Q and if f_{i+1}=g_i f_i for 1≤i<n and if f_1 is arbitrarily chosen
in S, then the code defined by Σ_{i=1}^{n} f_i(a_i) = c is "as good as
possible" in detecting the transpositions and the twin errors. The converse
is also true.


In view of the large number of permutations (|S| = 3628800 = 10!) a
computer program has been written to find the sets P and Q. Simply
generating all permutations and rejecting the ones for which x-f(x) is not
nearly a permutation, is not only inefficient but also unimaginative. It
is much nicer to generate the permutations lexicographically and to test
while each permutation is built up. Building each permutation by first
choosing f(0), then f(1) etc. is a multiple stage decision process and the
idea of dynamic programming can be applied. If for instance f(0), f(1) and
f(2) are chosen so that 0-f(0)=1-f(1)=2-f(2), then all 7! further cases
may be skipped. This would also be the case, according to the corollary
of theorem 3.2.0, if 0-f(0)=1-f(1)=2-f(2)+5. As will be seen below, a
further saving, by at least a factor 150, is gained through a study of
the structure of the sets P and Q. There are four transformations which
leave both P and Q invariant, and which define equivalence relations
in these sets.





The permutations h_a defined by h_a(x)=x+a, with x,a∈D, form a
representation of (D,+) in the symmetric group S, since h_a h_b(x) =
h_a(x+b) = x+b+a = x+a+b = h_{a+b}(x). The set H defined by
H={h_a | a∈D} is a subgroup of S isomorphic with (D,+).


If f∈P (or Q), then the double cosets HfH (={h_a f h_b | a,b∈D}) belong
to P (or Q), since x'+h_a f h_b(x') = y'+h_a f h_b(y'), with x'=x-b and
y'=y-b, follows immediately from x+f(x) = y+f(y). Hence the
transformations f → h_a f, with a∈D, leave both P and Q invariant. The
sets Hf (= {hf | h∈H}) obviously contain 10 different permutations, among
which there is just one which has 0 as a fixed point: i.e. h_{-f(0)}f,
as h_{-f(0)}f(0) = f(0)-f(0) = 0.

The search may therefore be limited to the sets P_0 and Q_0 defined by
P_0 = {f | f∈P, f(0)=0} and Q_0 = {f | f∈Q, f(0)=0}. Clearly
|P_0| = |P|/10 and |Q_0| = |Q|/10. By setting f(0)=0 there are 9!
possibilities left and the search is cut down by a factor 10.


The transformations G_a defined by G_a f = h_{-f(a)} f h_a leave P_0 and
Q_0 invariant, since {h_{-f(a)} f h_a | a∈D} ⊂ HfH and since
G_a f(0) = f(a+0)-f(a) = 0.
That the sets {G_a f | a∈D}, for f∈P_0, contain 10 permutations each is
true, but not trivial. It hinges on the circumstance that x-f(x) is
nearly a permutation for all f∈P_0. For such a function there are two
elements d_1 and d_2 such that d_1-f(d_1) = d_2-f(d_2). Let these elements
be called the duplicators. The duplicators of G_a(f) are d_1-a and d_2-a,
since f(d_1-a+a)-f(a)-(d_1-a) = f(d_1)-d_1+a-f(a) =
f(d_2)-d_2+a-f(a) = f(d_2-a+a)-f(a)-(d_2-a). If d_1-d_2≠5 then the ten
values of a give 10 different permutations, as they have different
duplicators. But if d_1-d_2=5 then it is conceivable that
f(x)=f(x+5)-f(5), since the functions on both sides of the equation have
the same duplicators. However if this were the case, then
f(5)=f(5+5)-f(5)=-f(5), or 2f(5)=0. But since f(0)=0, it follows that
f(5)=5 so 0 and 5 are both fixed points of f and hence duplicators.
Substituting 1 for x gives f(1)=f(1+5)-5 or f(1)-1=f(6)-6, which makes 1
and 6 also duplicators and consequently x-f(x) is not nearly a
permutation. Now that it has been established that each set has to have
10 members the problem remains to select appropriate representatives in
order to facilitate the search.
The effect of the transformation G_a is that the duplicators are both
shifted modulo 10. Their cyclic distance is therefore an invariant. Let
P_0i (resp. Q_0i) be the subset of P (resp. Q) consisting of the
permutations f such that 0 and i are the duplicators of x-f(x). Now each
permutation f∈P_0-P_05 has just one equivalent permutation in one of the
classes P_01, P_02, P_03, P_04. The permutations of P_05 will have two
equivalent permutations in P_05. It is therefore only necessary to find
the sets P_0i for i<6, and |P_0| = 10 Σ_{i=1}^{4} |P_0i| + 5|P_05|. It is
much easier to search for the permutations of P_0i, since not only two
values are fixed, but also because the test is simpler now, as no more
duplications in x-f(x) are allowed. Moreover according to the corollary
of theorem 3.2.0 f(x)≠x+5 has to hold.
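The behaviour of the duplicators under G_a can be made concrete with a small sketch. The example permutation is an assumption of this illustration (the digit-sum-of-2x permutation, which has f(0)=0 and duplicators 0 and 9); the helper names are likewise hypothetical.

```python
def differences(f):
    """The list of values x - f(x) modulo 10."""
    return [(x - f[x]) % 10 for x in range(10)]

def duplicators(f):
    """The two arguments with equal x - f(x); assumes f belongs to P."""
    seen = {}
    for x, d in enumerate(differences(f)):
        if d in seen:
            return (seen[d], x)
        seen[d] = x
    raise ValueError("no repeated difference found")

def G(a, f):
    """G_a f (x) = f(x + a) - f(a)  (mod 10)."""
    return [(f[(x + a) % 10] - f[a]) % 10 for x in range(10)]

f = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]      # assumed example with f(0) = 0
orbit = [G(a, f) for a in range(10)]
for a, g in enumerate(orbit):
    print(a, g, duplicators(g))
```

Each G_a f again fixes 0, its duplicators are those of f shifted by -a, and the ten results are pairwise different, as the argument in the text predicts for duplicators whose distance is not 5.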

The search can further be limited by yet another transformation type,
which leaves P_0 (and Q_0) invariant. This transformation is based on the
automorphisms of the cyclic group C_10. An automorphism is a permutation
φ of the elements of the group such that φ(a+b)=φ(a)+φ(b).
The automorphisms form a subgroup of the symmetric group. In the case
of C_10 it is a cyclic group with 4 elements. The formula above suggests
a multiplication and indeed multiplication modulo 10, by a factor
relatively prime with 10, does the trick. These factors are 1,3,7,9. It
should be noted that if (a,10)≠1 then ax (mod 10) is not even nearly
a permutation of x. The automorphisms are generated by 3 (or 7), since
1=3^0, 3=3^1, 9=3^2 and 7=3^3 (mod 10). The order of an element a is
defined as the lowest positive integer k, such that ka=0. It is well-known
from the theory of groups that the automorphisms leave the order
invariant. The group C_10 has 4 elements of the order 10, i.e. 3,9,7,1
and 4 elements of the order 5, i.e. 2,4,6,8. 0 is the only element
of the order 1 and 5 has the order 2. Hence φ(0)=0 and φ(5)=5 for
each automorphism φ. To each automorphism φ there corresponds a
transform F_φ defined by F_φ(f)=φfφ^{-1}; it is known as the transform of
f by φ. If f is given by a table

f = ( 0    1    2    ...  9    )
    ( f(0) f(1) f(2) ...  f(9) )

then F_φ(f) is obtained by applying φ to all the symbols of both entries
of the table (see 8).
Suppose that f∈P_0i and φ(i)=j and let g=F_φ(f), then g(j)=φfφ^{-1}(j)=
φf(i)=φ(i)=j and g(0)=φfφ^{-1}(0)=φf(0)=φ(0)=0. Moreover, since x-f(x)
is nearly a permutation, it follows that x-g(x) is so too, by remarking
that {x-g(x)} = {x-φfφ^{-1}(x)} = {φ(y)-φf(y)} = {φ(y-f(y))}.


Hence g∈P_0j. Now it is possible to project P_04 on P_02 and P_03 on P_01
by the transforms F_3 and F_7. The class P_05 however is left invariant
and though this third transformation induces an equivalence in P_05 it is
hard to capitalize upon this fact. It is also tempting to try to split
P_0i further by using the transformation g(x)=-f(-x), which projects P_0i
on P_{0,-i}, and to go back to P_0i by the transformation of the second
type h(x)=g(x-i)-g(-i). The resulting transformation is h(x)=i-f(i-x).
This transformation does not necessarily lead to a different permutation,
as can be seen in the first table below for i=1. In P_05 the transform F_9
has also its invariants. An example is given in the second table below.


Note that x+f(x) is also nearly a permutation in this second example.





[Two tables. Rows of the first table: x, f(x), x-f(x), 1-x, f(1-x),
1-f(1-x). Rows of the second table: x, f(x), x-f(x), -x, f(-x), -f(-x).]
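The transforms F_φ and the map h(x)=i-f(i-x) are easy to experiment with. The following sketch is an illustration only: the function names are hypothetical, and the example permutation (digit sum of 2x, with duplicators 0 and 9) is an assumption of this example and not one of the permutations tabulated in the text.

```python
from math import gcd

UNITS = [m for m in range(10) if gcd(m, 10) == 1]      # 1, 3, 7, 9

def order(a):
    """Lowest k > 0 with k*a = 0 (mod 10)."""
    k, total = 1, a % 10
    while total != 0:
        k, total = k + 1, (total + a) % 10
    return k

def transform(m, f):
    """F_phi(f) = phi f phi^-1 for phi(x) = m*x (mod 10)."""
    m_inv = next(k for k in UNITS if (m * k) % 10 == 1)
    return [(m * f[(m_inv * x) % 10]) % 10 for x in range(10)]

def reflect(i, f):
    """h(x) = i - f(i - x)  (mod 10)."""
    return [(i - f[(i - x) % 10]) % 10 for x in range(10)]

def duplicators(f):
    """The two arguments sharing the same value of x - f(x)."""
    groups = {}
    for x in range(10):
        groups.setdefault((x - f[x]) % 10, []).append(x)
    return next(xs for xs in groups.values() if len(xs) == 2)

f = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]      # assumed example, duplicators 0 and 9
print([order(a) for a in range(10)])
for m in UNITS:
    print(m, transform(m, f), duplicators(transform(m, f)))
print(reflect(9, f) == f)
```

For this example the duplicators 0 and 9 are carried to 0 and 9m (mod 10), and the map h with i=9 reproduces f itself, which illustrates that such a transformation need not lead to a different permutation.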


It may be worthwhile to note that there is yet a fourth transformation I,
which leaves P (and Q) invariant. It is defined by I(f)=f^{-1}. The
invariance follows by substituting f^{-1}(y) for x in the relation
|{x-f(x)}| = 9, for {f^{-1}(y)-y} will have the same number of elements.
Since moreover the inverse of a permutation f has the same fixed points as
f, also P_0i (and Q_0i) is left invariant. It can happen that I(f)=f, as
is shown by the example on page 69.

Even though some of the transformations mentioned in this section are
not used to relieve the search, they still are helpful for checking
the output.

The result of this section is that it is only necessary to find the
sets P_01, P_02 and P_05 (or Q_01, Q_02, Q_05). The set P (or Q) can be
found afterwards by applying certain transformations on these sets.

The following equalities hold: |P| = 200(|P_01| + |P_02|) + 50|P_05| and
|Q| = 200(|Q_01| + |Q_02|) + 50|Q_05|.

3.4. The search program. 


In the program the permutations are built by means of a multiple decision
process. In 8 stages the process is ready, since f(0)=0 and f(i)=i
has to hold if f∈P_0i. The available function values are stored as a
chain in an array called chain(-1:9). Each time that a digit j is
allocated to some f(i) the chain is shortcircuited by the assignment
chain(k):=chain(j). k is supposed to be the previous value which
was possible; for 0 the dummy value -1 is taken as the previous
one. So chain(-1) refers to the first digit which is still available,
chain(chain(-1)) to the second one and so on.

The crucial part of the program is the test for the feasibility of
an allocation. In a boolean array called difference(0:9) the
occurrence of a difference i-f(i) is memorized by making
difference(i-f(i)) true as soon as some value has been allocated to f(i).
The fields difference(0) and difference(5) are initialized by true, since
0 and 5 are not allowed as a difference i-f(i). In order to see whether
j is possible as value for f(i) the program simply tests whether
difference(i-j) is false or not.

In the array f(0:9) the allocated value is stored and in the array
choice(0:9) the value previous to the one allocated is memorized.
The latter array is necessary to be able to revoke a decision, if the
process comes to a stop at a higher level. After revoking a decision the
feasibility of the next possible value is tested. If there is no next
value available, the decision at the previous stage has to be revoked,
and so on. The process also stops temporarily when a permutation
satisfying the conditions has been found. It is then counted in the array
count(1:5) and it can be tested whether the permutation also belongs
to Q; if so, a print-out is requested. After that, the process is started
again by revoking the last decision.

The flowchart of the program is given on page 71, for the benefit of
those readers who prefer this less clear but more general description
over the precise list of instructions. The latter is given all the
same to make the details available.

The structure of the program, as given in the flowchart, is quite general.
If for instance the test is skipped, the program will generate all
permutations in lexicographical order. If however f(i)≠i is used as
test condition, the non-concurrent permutations are generated. The
same flowchart can be used for more complicated problems, like the
pentomino-fitting-puzzles (see 15).
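The decision process described above can be sketched in a modern language. The version below is not a transcription of the author's program: it uses recursion instead of the chain and choice arrays, and the function name `search` is an assumption of this example. It keeps the same feasibility test, namely a boolean record of the differences x-f(x) already used, with 0 and 5 blocked from the start.

```python
def search(i):
    """Enumerate P_0i: f(0)=0, f(i)=i and all other differences x-f(x) distinct."""
    f = [None] * 10
    f[0], f[i] = 0, i
    positions = [x for x in range(10) if x not in (0, i)]
    free = [y not in (0, i) for y in range(10)]        # values still available
    difference = [d in (0, 5) for d in range(10)]      # 0 is used up, 5 must stay missing
    found = []

    def place(stage):
        if stage == len(positions):
            found.append(tuple(f))
            return
        x = positions[stage]
        for y in range(10):                            # lexicographic order of values
            d = (x - y) % 10
            if free[y] and not difference[d]:
                f[x], free[y], difference[d] = y, False, True
                place(stage + 1)
                free[y], difference[d] = True, False   # revoke the decision

    place(0)
    return found

for i in (1, 2, 5):
    print(i, len(search(i)))
```

The printed class sizes can be compared with the figures reported for the author's program. Dropping the test `not difference[d]` turns the routine into a plain lexicographic generator of the permutations with f(0)=0 and f(i)=i, in line with the remark about the generality of the flowchart.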


In chapter 5 it will be used again.

'begin' 'integer' d,j,k,x,y;
        'integer' 'array' f,choice(0:9),chain(-1:9),count(1:5);
        'boolean' 'array' difference(0:9);
        'for' j:=1,2,5 'do'
        'begin' 'for' x:=0 'step' 1 'until' 9 'do'
                'begin' chain(x):=x+1; difference(x):='false'
                'end'; count(j):=0;
                difference(0):=difference(5):='true';
                x:=chain(-1):='if' j 'equal' 1 'then' 2 'else' 1;
                'if' j 'greater' 1 'then' chain(j-1):=j+1;
start:          k:=-1;
test:           y:=chain(k); d:=x-y; 'if' d 'less' 0 'then' d:=d+10;
                'if' difference(d) 'then' 'go to' miss;
allocate:       f(x):=y; chain(k):=chain(y); choice(x):=k;
                difference(d):='true'; 'if' x 'less' 9 'then'
                'begin' x:=x+1; 'if' x 'equal' j 'then' x:=x+1;
                        'go to' start
                'end';
discard:        count(j):=count(j)+1;
                x:=9;
cancel:         k:=choice(x); y:=f(x); chain(k):=y; d:=x-y;
                'if' d 'less' 0 'then' d:=d+10; difference(d):='false';
miss:           'if' chain(y) 'less' 10 'then'
                'begin' k:=y; 'go to' test
                'end';
                'if' x 'greater' 1 'then'
                'begin' x:=x-1; 'if' x 'equal' j 'then'
                        'begin' 'if' j 'equal' 1 'then' 'go to' ready; x:=x-1
                        'end';
                        'go to' cancel
                'end';
ready:          nlcr(1); write("the number of permutations in p0");
                type(j); write(" is "); type(count(j))
        'end';
        d:=50×(4×(count(1)+count(2))+count(5));
        write("the total number of permutations in p is");
        type(d)
'end'

As a result of the program it turns out that |P_01| = |P_02| = 104 and
that |P_05| = 96. Hence |P| = 200(104+104)+50×96 = 46400.
Furthermore |Q_01| = 2, |Q_02| = 8 and |Q_05| = 16, so that
|Q| = 200(2+8)+50×16 = 2800.

Since Q is not empty it is natural to disregard the rest of P, at least
temporarily.
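The arithmetic behind these totals is short enough to spell out. The class sizes in the sketch are simply the figures quoted in the text; the variable names are chosen for this illustration. The factor 200 combines the ten choices of f(0), the ten shifts G_a and the two classes that are images of each other under the multiplications by 3 and 7; for the classes with index 5 only 10×5 = 50 remains.

```python
p01, p02, p05 = 104, 104, 96      # class sizes quoted for P
q01, q02, q05 = 2, 8, 16          # class sizes quoted for Q

size_P = 200 * (p01 + p02) + 50 * p05
size_Q = 200 * (q01 + q02) + 50 * q05

print(size_P)                     # size of P
print(size_Q)                     # size of Q
print(size_Q ** 2)                # ordered pairs of members of Q
```

The last figure is the number of pairs from Q×Q, which is the starting point for any search for pairs of permutations within Q.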


The permutations of Q_01 are:

[Table of the two permutations of Q_01.]

The transformation f(x) → 1-f(1-x) leaves both permutations invariant,
but the transformation f(x) → f^{-1}(x) interchanges them.


The permutations of Q_02 are:


[Table of the eight permutations of Q_02.]









In 02 the transformation flxjed-f(2=x) interchanges ds and Ag; a, 


sq and 


and q., whereas the pairs q 5 4 


and do’ %s 6 7 


As de and A0’ dg and Ao are each others inverse. 


and Ag q 3 and q 
In Q_05 the transformations f(x) → f(x+5)+5, I(f)=f^{-1} and F_φ(f),
in particular F_9(f)(x)=-f(-x) and F_7(f)(x)=7f(3x), are of interest.
Q_05 has 16 permutations, which are tabulated on the next page.


The transformation f(x)sf(x+5)+5 interchanges the pairs: (a, ; >de 


Ca, 9745) ; Ca, 3,4 g) ; Ca, 4 as? ’ (a, 5:49) ; Ca, 77420’ : (a, 5” doa) : (ans Aog) e 
The group of transformations F_{3^i}, with 1≤i≤4, splits Q_05 into 6
classes, i.e. 4 classes with 2 elements and 2 classes with each 4
elements. In each class the elements are interchanged cyclically.
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The classes are: Car -Aj5) (A, 97400): (4777 A5) (Apgr Ang): (A gr 4o3r 41” 
A4) (a, 34e dig A4) 

The transformation I splits Q_05 into 8 classes of 2 members each which
are each others inverse. The pairs are: Ca, 1: 4op) : Ay9r 45): (4,3:493) 
Ca, 4740): (ay Aga): (arr dog): (Aged 1 (A97 495): 


205 
the permutations A4 and ag: 


is generated by means of the transformations mentioned above from 


Define Qs PY Ws ={A1P 4127413 A5 Yet 407 Y21’ 23 }. 
The set Q_05' is chosen in such a way that the transformation
f(x) → f(x+5)+5 has no equivalent permutations in Q_05'. Therefore, if
Q' = Q_01 ∪ Q_02 ∪ Q_03 ∪ Q_04 ∪ Q_05', it follows that the set Q'
together with the 100 transformations f(x) → f(x+a)+b generate the set Q.
In the next section it will be seen that this concise description of the
set Q, as a by-product of the hunt for shortcuts in the search, is a great
asset in itself. Now that the (first) search is over it is still useful to
try to get the set Q in a tighter grip by means of the transformation
group F_{3^i}. This group splits the set Q_02 ∪ Q_04 into classes of 4
elements

each. The set {a a} denoted by Qoo has a representative from 


3’ dze As 
each class. The set Qo1V Loa” Los however is split into classes with 2 

i À ne L/ te ha B. 
elements each, if Q05 (a, „+4, 3450’ 423} then Wor Qs s a repre 
sentative of each class, so that Q is known as soon as Q"=0 MU Ae 
is known. By using the inverse of the permutations a still greater 


reduction can be achieved, in fact Q is generated by the set 


la, »4g 447470 } 
The condition for the detection of the jump transpositions was that
the function x - f_{i+2}f_i^{-1}(x) is nearly a permutation. In the
foregoing section it was shown that the code detects the transpositions
and the twin errors optimally if g_i = f_{i+1}f_i^{-1} ∈ Q. It is
therefore advisable to apply permutations f_i such that
g_i = f_{i+1}f_i^{-1} ∈ Q and that the consecutive g's satisfy the
condition that x - g_{i+1}g_i(x) is nearly a permutation, or that at least
the equality x - g_{i+1}g_i(x) = y - g_{i+1}g_i(y) is valid for the
minimum number of pairs x,y. If the first is the case the permutations
g_{i+1} and g_i are said to be matching. A chain of permutations
g_i∈Q is wanted such that the adjacent g's form matching pairs. An
obvious strategy for finding such chains is to take a g_1∈Q at random
and to search for a g_2 matching with g_1 and so on. This is, especially
if the matching pairs are scarce, very inefficient. It is better
to make a catalogue of the matching pairs first and to build the chains
with the aid of that catalogue afterwards. The matching pairs have to be
selected out of the 7840000 pairs from Q×Q. It seems worthwhile to
investigate whether the representation of Q by means of the set Q' and
the transformations f(x) → f(x+a)+b yields a saving in labour. Let
f,f'∈Q, then for some g and g' with g,g'∈Q', f(x)=g(x+a)+b and
f'(x)=g'(x+a')+b' hold. The pair f,f' matches if x-g(g'(x+a')+b'+a)-b
is nearly a permutation, which is equivalent with the requirement
that y-g(g'(y)+c), with b'+a=c, is nearly a permutation. So if g(x+c)
and g'(x) match, with g,g'∈Q' and c∈D, then g(x+a)+b and g'(x+a')+b',
with b'+a=c, also match. This remark reduces the number of pairs, which
have to be tested, by a factor 1000.
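The reduction rests on the observation that the shifts a, b, a', b' only enter through c = b'+a. This can be tried out numerically: the sketch below (helper names and the randomly drawn permutations are assumptions of this illustration, the permutations are not members of Q) compares the number of distinct values of x-f(f'(x)) with that of y-g(g'(y)+c).

```python
import random

def distinct_differences(perm):
    """Number of different values of x - perm(x) modulo 10."""
    return len({(x - perm[x]) % 10 for x in range(10)})

def shifted(g, a, b):
    """f(x) = g(x + a) + b  (mod 10)."""
    return [(g[(x + a) % 10] + b) % 10 for x in range(10)]

def compose(f, f_prime):
    """(f f')(x) = f(f'(x))."""
    return [f[f_prime[x]] for x in range(10)]

rng = random.Random(1)
g = rng.sample(range(10), 10)            # arbitrary permutations for the demonstration
g_prime = rng.sample(range(10), 10)

a, b, a_prime, b_prime = 3, 8, 6, 4
c = (b_prime + a) % 10
full = distinct_differences(compose(shifted(g, a, b), shifted(g_prime, a_prime, b_prime)))
reduced = distinct_differences(compose(shifted(g, c, 0), g_prime))
print(full, reduced)
```

Both counts agree for every choice of the four shifts, because the two sets of differences differ only by the constant a'+b. A pair is matching exactly when this count equals 9.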

A further reduction is obtained with the aid of the transformation
F_{3^i}, since it may be assumed that the second member of the pair lies
in Q''. If the second member lies outside Q'' then the transformation
F_{3^i} will bring it into Q'' and from

x-g(g'(x)+c) = 7·3x - 7·3g(7(3g'(7·3x)+3c)) = 7(y - F_3g(F_3g'(y)+3c)),

with y=3x, it follows that F_3g, F_3g' match if g,g' do. Hence only the
2800 triplets g,g',c with g∈Q', g'∈Q'' and c∈D have to be tested. A
simple program gives as sad result that no matching pairs exist. This is
in agreement with the fact that up to now no code was known (at least to
the author) which had the property to detect both the transpositions and
the jump transpositions nearly optimally. In trying to prove that these
codes do not exist the following example was found:








Both permutations leave 7 ve. (DG) twin errors undetected, but the 


jump twin errors are detected nearly optimal. 

There are two possible approaches now:

i) Admit also permutations which do not optimize the detection of the
twin errors or;

ii) suboptimize the detection of the jump transpositions.

By means of a simple computer program it appears that the permutations of
P_01, P_02 and P_05 are divided with respect to their twin error detection
capacity according to the following table:






5 6 7 8 9 


undetected twin errors | | 
ol aol 62 | 54 |l 22 l 48 16 | 4 





anar etn 


number of permutations 





Following the first suggestion would thus imply the loss of two more
twin errors, as the second class is empty. The second suggestion should
therefore be preferred, if it means the loss of only one more jump
transposition. Such is indeed the case as will be seen in the sequel.
Again the search may be limited to the triplets g,g',c, with g∈Q', g'∈Q''
and c∈D. A pair g,g' is said to be nearly matching if
x-gg'(x) = y-gg'(y) holds for only two pairs x,y.

By a modified program all the triplets are now tested to see whether
there are nearly matching pairs. For the approved triplets it is counted
how many pairs x,y satisfy x+g(g'(x)+c) = y+g(g'(y)+c). It turns out that
32 pairs are the topscorers, each leaving two jump transpositions and
two jump twin errors undetected. Eight permutations are involved in these
32 pairs. These permutations fall apart into two families. The members of
one family are the inverse of those of the other family. Within each
family any two can form a nearly matching pair with an appropriate
value for c. The permutations of one of the families are denoted by
h_1, h_2, h_3, h_4 and h_i(x+a)+b nearly matches h_j(x+a')+b' provided
that b'+a=c_ij, where c_ij is listed below.





It is easy now to produce chains of any length, consisting of nearly
matching permutations. As a matter of fact h_i(x+c_ii) provides an
infinite chain with equal links, resulting in a (periodic) progressive
code. Codes based on these chains detect 97.8% of the transpositions,
97.8% of the twin errors, 95.6% of the jump transpositions and 95.6% of
the jump twin errors. The problem of selecting chains which detect the
phonetic errors optimally is dealt with in the next section.


A phonetic error on the positions i and i+1 is detected if

x - f_{i+1}f_i^{-1}(x) ≠ f_i(1) - f_{i+1}(0) for x∈D, or

x - g_i(x) ≠ f_i(1) - g_i f_i(0), with f_{i+1} = g_i f_i.

The righthand side is a fixed value, which has to correspond with the
missing value of the set {x-g_i(x) | x∈D}. The Colenbrander codes
(see section 2.3) defined by Σ f^i(g(a_i)) = 0 (mod 10), with

f = ( 0 1 2 3 4 5 6 7 8 9 )
    ( 9 2 4 6 8 0 1 3 5 7 )

and where g is an arbitrary permutation, admit an elegant solution.

In {x-f(x) | x∈D} the value 0 is missing and by choosing g so that
f(g(0))=g(1) it is achieved that x-f(x) ≠ f^i g(1) - f^{i+1}g(0) is
equivalent with x≠f(x), which is valid for all x since the zero was
missing. The choice

g = ( 0 1 2 3 4 5 6 7 8 9 )
    ( 9 7 3 6 1 2 4 8 5 0 )

which is the one used in the tests of section 2.4, is a proper one. The
general case is more difficult.
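The two properties claimed for the Colenbrander permutations can be checked directly. The sketch takes f and g as they are read from the tables above; the helper `f_power_g` and the range of positions tested are assumptions of this illustration. A phonetic error is taken here to be the confusion of a0 with 1a.

```python
f = [9, 2, 4, 6, 8, 0, 1, 3, 5, 7]      # as read from the table for f
g = [9, 7, 3, 6, 1, 2, 4, 8, 5, 0]      # as read from the table for g

missing = sorted(set(range(10)) - {(x - f[x]) % 10 for x in range(10)})
print("missing value of x - f(x):", missing)
print("f(g(0)) == g(1):", f[g[0]] == g[1])

def f_power_g(i, x):
    """f_i(x) = f^i(g(x)): apply g once, then f i times."""
    y = g[x]
    for _ in range(i):
        y = f[y]
    return y

def phonetic_detected(i, a):
    """Is the confusion of a0 with 1a on positions i, i+1 detected?"""
    left = (f_power_g(i, a) + f_power_g(i + 1, 0)) % 10
    right = (f_power_g(i, 1) + f_power_g(i + 1, a)) % 10
    return left != right

print(all(phonetic_detected(i, a) for i in range(1, 12) for a in range(10)))
```

Because f(g(0)) = g(1), the two fixed terms cancel and the test reduces to f(u) ≠ u, which holds since 0 is the missing difference of f.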
For the functions h_i, given in the preceding section, this missing
value is 5, and for h_i(x+a)+b it is 5-a-b. Suppose that a chain
g_1,g_2,...,g_n has the property that the code based on the g's in the
usual manner detects all phonetic errors. Suppose furthermore that the
g's are selected from {h_1,h_2,h_3,h_4} and that the selected h's are
suitably transformed as set forth in the section above. The question
arises as to the conditions for prolongation of the chain. Let f_i(0)=q
and f_i(1)=p and let g_{i-1} be a transform of h_s. If the next link of
the chain is a transform of h_t then it has to be h_t(x+c_ts). The missing
value of {x-h_t(x+c_ts) | x∈D} is 5-c_ts and hence the condition for
detecting the phonetic errors is p-h_t(q+c_ts) = 5-c_ts. This implies
that for each t only certain triplets p,q,s, with p,q∈D and
s∈{1,2,3,4}, admit the prolongation of the chain by the permutation
h_t(x+c_ts). After the prolongation a new triple h_t(p+c_ts),
h_t(q+c_ts), t arises, which does or does not admit further extension of
the chain. For each choice of t and s there are 10 pairs p,q which satisfy
the condition p-h_t(q+c_ts) = 5-c_ts. Hence there are altogether 160
triplets, namely 5-c_ts+h_t(q+c_ts), q, s which are suitable for chain
extension. The resulting triplets after the extension are
h_t(5+h_t(q+c_ts)), h_t(q+c_ts), t, with q∈D and s,t∈{1,2,3,4}. In terms
of the theory of graphs the problem is to find, possibly circuits, but at
least the maximal paths in the directed graph defined by the coupling of
the triplets.
The triplets themselves are the points of the graph. Let these points be
denoted by T_i with 0≤i≤399. The edges of the graph are the 160
ordered pairs (T_i,T_j). The vertex T_i is called the initial vertex and
T_j is called the terminal vertex of the edge. An edge (T_i,T_j) with a
terminal vertex, which is not the initial vertex of any other edge,
is called a twig.

Obviously a twig cannot be part of a circuit and removing the twigs
from the graph will not eliminate any circuits. The length of the
maximal paths however will be diminished by one. The following
procedure is applied in order to find the maximal paths:
i) Find the twigs of the graph. This can be done efficiently by
having the list of the edges ordered according to the initial vertex.
ii) Remove the twigs from the edge-list and put them on the first
pruning-list.
iii) Repeat the pruning until there are no more twigs left. If the
graph is pruned away after n prunings then the length of the longest
chains is also n, otherwise there has to be a circuit. The longest chain
can be reconstructed from the pruning-lists, starting from the back.
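The pruning procedure of steps i) to iii) can be sketched as follows. The function names and the toy graphs are assumptions of this illustration; the 160 edges of the actual graph are not reproduced here. One small deviation is made on purpose: an edge from a vertex to itself is never treated as a twig, so that it shows up as a circuit.

```python
def prune(edges):
    """Strip twigs repeatedly; return (pruning_lists, remaining_edges)."""
    remaining = set(edges)
    pruning_lists = []
    while True:
        initial_vertices = {u for (u, v) in remaining}
        twigs = {(u, v) for (u, v) in remaining if v not in initial_vertices}
        if not twigs:
            return pruning_lists, remaining
        pruning_lists.append(sorted(twigs))
        remaining -= twigs

def longest_chain(pruning_lists):
    """Rebuild one longest path from the pruning lists, starting from the back."""
    if not pruning_lists:
        return []
    chain = [pruning_lists[-1][0]]
    for twigs in reversed(pruning_lists[:-1]):
        tail = chain[-1][1]
        chain.append(next(edge for edge in twigs if edge[0] == tail))
    return chain

tree = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]
lists, left = prune(tree)
print(len(lists), left, longest_chain(lists))

with_circuit = [("A", "B"), ("B", "A"), ("B", "C")]
lists, left = prune(with_circuit)
print(len(lists), left)
```

In the first graph everything is pruned away after three rounds and the reconstructed chain has three edges; in the second graph two edges survive, which signals a circuit.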

In the present case it turns out that there are 64 chains having the
maximal length of 6. These chains are intertwined in a manner shown in
the drawing below.

Unfortunately there does not exist a circuit, but it is possible to
connect the chains, be it with the introduction of one non-phonetic-
error-proof link. These improper links are shown in the drawing by the
dotted lines. The starting points 058 and 174 are obtainable by taking


for the initial permutation f, the permutations hj G+3) and h, Ge+8) 


respectively. The first one happens to be the same as the one required
for making an improper link. A natural choice for an infinite sequence

of g's is therefore: hCG) ‚h,Ger2) ‚nh Get5) h, Get5) ‚h, Ge+2) hG5), where 
these six permutatiens have to be repeated in this order The re= 

sulting permutations É, become : fo =h Gerd), £, GO=h. C GH3)+2), 
£Go)ehe (h, (h(at3)+2)+5) and so on. In the table on ‘he next page 


the various permutatdons are given. 


aenema 


h_ (+3) | 
hG2) | 








h 


h_{x+5) 


h_(x+5) 


ino jes las |O 


h, G+2) 


h_(+5) 


DO ie 


EN 


ge 


hts) | 
h‚G+5) 





This code detects 97.8% of the transpositions and the twin errors, 95.6%
of the jump transpositions and the jump twin errors and about 97.9% of the
phonetic errors. The latter percentage will be slightly better for short
codes, since one out of the six positions will miss one of the eight
possible phonetic errors.
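The quoted percentages follow from simple counting. The sketch assumes 45 unordered pairs of different digits per position (for transpositions as well as for twin errors), one or two of which are missed, and 6×8 phonetic errors per period of six positions, one of which is missed; the variable names are chosen for this illustration.

```python
pairs = 45                                   # unordered pairs {a, b}, a != b

one_missed = round(100 * (pairs - 1) / pairs, 1)
two_missed = round(100 * (pairs - 2) / pairs, 1)

phonetic_total = 6 * 8                       # six positions, eight errors each
phonetic = round(100 * (phonetic_total - 1) / phonetic_total, 1)

print(one_missed, two_missed, phonetic)
```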

The period of the code above is 90, since after 15 repetitions the
permutations will be reproduced. This follows from the fact that the cycle
representation of f_6 is (0)(1)(27469)(583) and hence the order of f_6
is 15. The same method of search might be applied to the permutations,
which give rise to codes detecting only 42 of the 45 twin errors per
position. This is not elaborated here, since the next chapters contain
codes which are superior anyhow. It should be emphasized however that
the considerations of the chapters 3,4,5 are based on the assumption
that the various errors are uniformly distributed. If e.g. a certain
transposition ab→ba is rare, it would be an obvious advantage to adapt
a check equation in such a way that the "missed transposition" will
be such a rare one. More reliable statistics from different sources
will be needed before it is justified to follow such a strategy.
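The period of 90 mentioned above comes from the order of a permutation, which is the least common multiple of its cycle lengths. The sketch below (function names are assumptions of this illustration) computes the order of the permutation with the cycle representation (0)(1)(27469)(583) in two ways.

```python
from functools import reduce
from math import gcd

cycles = [(0,), (1,), (2, 7, 4, 6, 9), (5, 8, 3)]

def order_from_cycles(cycles):
    """Least common multiple of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), (len(c) for c in cycles), 1)

def permutation_from_cycles(cycles, n=10):
    """Table of the permutation that sends each element to its successor in its cycle."""
    perm = list(range(n))
    for cycle in cycles:
        for k, x in enumerate(cycle):
            perm[x] = cycle[(k + 1) % len(cycle)]
    return perm

def order_by_iteration(perm):
    """Count how often perm must be applied before the identity returns."""
    current, count = list(perm), 1
    while current != list(range(len(perm))):
        current = [perm[x] for x in current]
        count += 1
    return count

perm = permutation_from_cycles(cycles)
print(order_from_cycles(cycles), order_by_iteration(perm), 6 * order_from_cycles(cycles))
```

With six permutations per repetition and an order of 15, the whole pattern recurs after 6×15 = 90 positions.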
In this chapter the idea of applying the dihedral group of order 10
will be pursued.

The cyclic group of order 10 and the dihedral group are the only
groups of order 10. The latter one is generated by two elements δ
and ε, with the generating relations δ^5 = δ^0, ε^2 = δ^0, δε = εδ^4
(δ^0 being the unit element).

The group is denoted by D_5, since it is a transformation group of the
pentagon. The δ stands for the rotations over 72 degrees, whereas
the ε stands for the transformation which turns the plane of the
pentagon upside down. The generating relations will be self evident
in this interpretation. The elements of the group represent the
symmetries of the pentagon, i.e. δ^j, with 1≤j≤4, are the rotation
symmetries and δ^j ε, with 0≤j≤4, are the reflexions with respect
to the 5 axes. The elements can be coded arbitrarily with the 10
decimal digits. In this chapter the coding chosen is: δ^j → j
and δ^j ε → j+5, for 0≤j≤4.

The dihedral group is non-abelian, as can be seen from the third
generating relation. For this reason the operation will be written
as a multiplication. The multiplication is denoted by the sign ×,
but this sign is often omitted when the generating elements are
multiplied, as in δ^j ε.

In the table on the next page the result of the multiplication is
given. This table, sometimes called Cayley table, can be taken as
an alternative definition of the group (D,×).

Here, as in chapter 3, D stands for the set {1,2,3,4,5,6,7,8,9,0}.
Note that the digit 0 denotes the multiplicative unit of D_5.
The group D_5 has as a subgroup ({0,1,2,3,4},×) which is a cyclic
group of the fifth order, C_5. Also ({0,5},×) is a cyclic subgroup (C_2).

The Cayley table of D_5

The unit, 0, has the order 1 and the elements 1, 2, 3, 4 have the
order 5, since δ^i = i for 1≤i≤4 and i^5 = δ^{5i} = δ^0. The 5 elements
5, 6, 7, 8, 9 are all of the order 2 since δ^j ε δ^j ε = δ^j δ^{-j} ε^2 = δ^0.
This was to be expected from the geometrical interpretation of the group,
since reflexions are of the second order.
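The Cayley table follows mechanically from the coding δ^j → j, δ^j ε → j+5 and the relation δε = εδ^4, which gives (δ^i ε^s)(δ^j ε^t) = δ^(i±j) ε^(s+t) with the minus sign when s=1. The sketch below is a reconstruction under exactly those assumptions and is not copied from the printed table; the function names are hypothetical.

```python
def mul(a, b):
    """Product a x b in D_5 with delta^j -> j and delta^j epsilon -> j + 5."""
    i, s = a % 5, a // 5
    j, t = b % 5, b // 5
    rotation = (i - j) % 5 if s else (i + j) % 5
    return rotation + 5 * ((s + t) % 2)

def order(a):
    """Lowest k > 0 such that the k-fold product of a is the unit 0."""
    k, power = 1, a
    while power != 0:
        k, power = k + 1, mul(power, a)
    return k

table = [[mul(a, b) for b in range(10)] for a in range(10)]
for row in table:
    print(" ".join(str(entry) for entry in row))

print("orders:", [order(a) for a in range(10)])
print("1 x 5 =", mul(1, 5), "but 5 x 1 =", mul(5, 1))
```

The last line shows the non-commutativity: δ×ε and ε×δ are different reflexions.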

The automorphisms of a group leave the order of the elements invariant.
Moreover an automorphism is determined as soon as its effect on the
generators is known. For the first generator δ, which is of the fifth
order, there are 4 possible images. For the second generator ε, which is
of the second order, there are 5 images feasible. The total number of
automorphisms is 20, since all these combinations are admitted. The 10


inner automorphisms τ_a, defined by τ_a(x) = a×x×a^{-1}, form a subgroup
of the group of all automorphisms. The automorphism group of D_5 will in
this chapter be denoted by A. A is generated by two elements ρ and σ,
with the relations ρ^5 = σ^4 = identity and σρ = ρ^2σ.

The cycle representations of ρ and σ are respectively:

(0)(1)(2)(3)(4)(56789) and (0)(1243)(5)(6798).

The powers of σ form a subgroup of the order 4.

The elements of A are permutations of D and hence A⊂S. The permutations
of A are listed in the table below.

The first 10 permutations form the subgroup of the inner automorphisms.
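The count of 20 automorphisms can be confirmed by trying all images of the two generators. The sketch assumes the same multiplication rule as the coding of this chapter implies; `extend`, `is_automorphism` and the other names are hypothetical helpers, and conjugation is written as a×x×a^{-1}.

```python
def mul(a, b):
    """Product in D_5 with delta^j -> j and delta^j epsilon -> j + 5."""
    i, s = a % 5, a // 5
    j, t = b % 5, b // 5
    return ((i - j) % 5 if s else (i + j) % 5) + 5 * ((s + t) % 2)

def inverse(a):
    return next(b for b in range(10) if mul(a, b) == 0)

def extend(image_delta, image_eps):
    """Map delta^j epsilon^s to image_delta^j x image_eps^s."""
    phi = []
    for x in range(10):
        y = 0
        for _ in range(x % 5):
            y = mul(y, image_delta)
        if x >= 5:
            y = mul(y, image_eps)
        phi.append(y)
    return phi

def is_automorphism(phi):
    return sorted(phi) == list(range(10)) and all(
        phi[mul(a, b)] == mul(phi[a], phi[b])
        for a in range(10) for b in range(10))

automorphisms = []
for image_delta in (1, 2, 3, 4):
    for image_eps in (5, 6, 7, 8, 9):
        phi = extend(image_delta, image_eps)
        if is_automorphism(phi):
            automorphisms.append(phi)

inner = {tuple(mul(mul(a, x), inverse(a)) for x in range(10)) for a in range(10)}
rho, sigma = extend(1, 6), extend(2, 5)
print(len(automorphisms), len(inner))
print("rho  :", rho)
print("sigma:", sigma)
```

The tables printed for rho and sigma correspond to the cycle representations (56789) and (1243)(6798) given in the text.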


4,5 Form 


utation of the requirements 





Analogous to the method of chapter 3, the codes based on D_5 are
defined as the set of all code words from D^n satisfying

c_0×f_1(a_1)×f_2(a_2)×...×f_n(a_n) = c_1 for fixed c_0, c_1∈D and f_i∈S.

It was shown in 3.0 that such a code detects all single errors. From
this point on the parallel with chapter 3 is broken in the sense that
the results are different. The treatment as a whole is analogous and
in fact, it is possible to formulate some of the proofs in such a way,
that they are valid for both the cyclic and the dihedral group. For
didactical reasons it was thought better to make both chapters
self-contained. Especially for those readers not familiar with group
theory, the third chapter would seem to be unnecessarily complicated.
The condition for the detection of the transpositions is

...×f_i(a_i)×f_{i+1}(a_{i+1})×... ≠ ...×f_i(a_{i+1})×f_{i+1}(a_i)×...
for a_i ≠ a_{i+1}.

The common factors on both sides can be cancelled by multiplication
from the left or from the right by the inverse of those factors, but
in the resulting inequality f_i(a_i)×f_{i+1}(a_{i+1}) ≠
f_i(a_{i+1})×f_{i+1}(a_i) the factors containing a_i and the ones
containing a_{i+1} cannot be separated. Substitution of x for f_i(a_i)
and y for f_i(a_{i+1}) gives:

x×f_{i+1}f_i^{-1}(y) ≠ y×f_{i+1}f_i^{-1}(x) for all x,y∈D with x≠y.

The condition for the detection of the twin errors will become after
the same reasoning: x×f_{i+1}f_i^{-1}(x) ≠ y×f_{i+1}f_i^{-1}(y) for all
x,y∈D with x≠y. In the latter condition the x and the y are separated.
The condition for the jump transpositions becomes

f_i(a_i)×f_{i+1}(a_{i+1})×f_{i+2}(a_{i+2}) ≠
f_i(a_{i+2})×f_{i+1}(a_{i+1})×f_{i+2}(a_i)

or after the substitutions x=f_i(a_i), y=f_{i+1}(a_{i+1}), z=f_i(a_{i+2}):
x×y×f_{i+2}f_i^{-1}(z) ≠ z×y×f_{i+2}f_i^{-1}(x) for all x,y,z∈D with x≠z.
In the same way the condition for the detection of the jump twin errors
will become: x×y×f_{i+2}f_i^{-1}(x) ≠ z×y×f_{i+2}f_i^{-1}(z) for all
x,y,z∈D with x≠z.

Both conditions for the jump errors are exacting, since they have to
hold for all y. It will be seen in section 4.5 that these exacting
functional relations not only ask much, but also give much (see also 11).
The phonetic errors are detected if f_i(x)×f_{i+1}(0) ≠ f_i(1)×f_{i+1}(x)
for all x≠0,1 with x∈D. Since for 1 and 0 the inequality is valid
anyhow the provision x≠0,1 may be dropped.

The permutations f_i occurring in the check equation may be defined
recursively by f_{i+1} = g_i f_i, with f_1 ∈ S.

The permutations g_i are used in the summary of the conditions.

1) Transpositions: x × g_i(y) ≠ y × g_i(x) for x,y ∈ D with x ≠ y.

2) Twin errors: x × g_i(x) ≠ y × g_i(y) for x,y ∈ D with x ≠ y.

3) Jump transpositions: x × y × g_{i+1} g_i(z) ≠ z × y × g_{i+1} g_i(x) for
x,y,z ∈ D with x ≠ z.

4) Jump twin errors: x × y × g_{i+1} g_i(x) ≠ z × y × g_{i+1} g_i(z) for
x,y,z ∈ D with x ≠ z.

5) Phonetic errors: x × f_{i+1}(0) ≠ f_i(1) × g_i(x) for x ∈ D.





The fifth and the first condition are no longer contradictory as they
were in the cyclic case.
The proof of the impossibility of the first four conditions is not
applicable, since D_5 is not abelian. In fact, and this is the advantage
of the dihedral group, there do exist permutations which satisfy the
first condition (see 4.4). The twin error detection requires that
x × g(x) is a permutation for some permutation g. Although the variables
are separated in the twin error condition, the non-existence proof of
3.2 is not applicable, since Π_{x∈D} x = Π_{x∈D} g(x) does not always hold
in D_5. Unfortunately this does not imply that there exist permutations
satisfying the requirement.

Theorem 4.2.0. There does not exist a permutation f, such that x × f(x)
is also a permutation.

Proof: Let the digits 0, 1, 2, 3, 4 be called low and the remaining
digits high, lo and hi for short. Suppose that f(x) is k times low
for low x. Thus k times: f(lo) = lo and hence 5-k times: f(lo) = hi
and f(hi) = lo, and thus k times f(hi) = hi. The low digits form a
subgroup of D_5 with a factor group of order 2, which means that
lo × lo = hi × hi = lo and lo × hi = hi × lo = hi. From this it follows that
x × f(x) is 2k times lo and 10-2k times hi. If x × f(x) were a permutation
then 5 low and 5 high digits had to occur, but 2k = 5 is not true.
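
The counting argument of the proof can be checked mechanically. The
following Python sketch assumes that the digit 5k + a stands for the
element δ^a ε^k (so that 0..4 are the low and 5..9 the high digits);
the names mul, low_count and check_parity are illustrative only and
do not come from the text.

```python
import random

def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    # epsilon * delta = delta^-1 * epsilon, so a reflection on the left negates b
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def low_count(f):
    """Number of digits x for which x * f(x) is a low digit."""
    return sum(1 for x in range(10) if mul(x, f[x]) < 5)

def check_parity(trials=1000, seed=1):
    """For random permutations f, x * f(x) is low exactly 2k times."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = list(range(10))
        rng.shuffle(f)
        k = sum(1 for x in range(5) if f[x] < 5)  # low digits mapped to low digits
        if low_count(f) != 2 * k:
            return False
    return True
```

Since low_count(f) is always even, it can never equal 5, which is the
content of the theorem.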


Hence x × f(x) can at best be nearly a permutation. The following
example shows that this is indeed possible.

This particular f does not satisfy the first condition, since
0 × f(5) = 5 × f(0) = 5. This is not accidental, since it can be shown
that no permutation f exists, such that x × f(x) is nearly a permutation
and such that it also satisfies the condition that x × f(y) ≠ y × f(x)
for all x ≠ y. The proof, which is very cumbersome, distinguishing several
cases, is left out. The fact will be established by the computer search
anyhow, which in itself is also a proof, distinguishing all cases.

It is also possible to give an upper bound for the detection rate of
the jump errors. There are 450 combinations for x,y,z ∈ D with x < z.
Let g_{i+1} g_i be denoted by h for short. The conditions then become
x × y × h(z) ≠ z × y × h(x) and x × y × h(x) ≠ z × y × h(z). As in the proof of
theorem 4.2.0, the pairs (x,h(x)) can be put into four classes denoted
by A_ij with i,j ∈ {0,1}. The i and the j can be taken as the exponent
of ε of x and h(x) respectively. So if x = δ^a ε^i and h(x) = δ^{a'} ε^j then
(x,h(x)) ∈ A_ij. Obviously |A_00| + |A_01| + |A_10| + |A_11| = 10 and
|A_00| + |A_01| = |A_10| + |A_11| = |A_00| + |A_10| = |A_01| + |A_11| = 5.
Two different classes A_ij and A_kl are called complementary if
i+j ≡ k+l (mod 2). Three cases are considered separately.

i) (x,h(x)) and (z,h(z)) belong to different non-complementary classes.
Both inequalities are then trivially fulfilled, since the number of
ε's is different on both sides of the sign, no matter what value y has.

ii) (x,h(x)) and (z,h(z)) belong to the same class, say A_ij. Suppose
that x = δ^a ε^i; z = δ^c ε^i; y = δ^b ε^k; h(x) = δ^{a'} ε^j; h(z) = δ^{c'} ε^j express
the representations of the various elements as products of the
generators of D_5. The conditions become after substitution

δ^a ε^i δ^b ε^k δ^{c'} ε^j ≠ δ^c ε^i δ^b ε^k δ^{a'} ε^j and
δ^a ε^i δ^b ε^k δ^{a'} ε^j ≠ δ^c ε^i δ^b ε^k δ^{c'} ε^j,

which can be converted, using the generating relations of D_5, into

δ^{a+(-1)^i b+(-1)^{i+k} c'} ε^{i+j+k} ≠ δ^{c+(-1)^i b+(-1)^{i+k} a'} ε^{i+j+k} and
δ^{a+(-1)^i b+(-1)^{i+k} a'} ε^{i+j+k} ≠ δ^{c+(-1)^i b+(-1)^{i+k} c'} ε^{i+j+k}.

After cancelling ε^{i+j+k} it follows that:

a + (-1)^i b + (-1)^{i+k} c' ≢ c + (-1)^i b + (-1)^{i+k} a' (mod 5) and
a + (-1)^i b + (-1)^{i+k} a' ≢ c + (-1)^i b + (-1)^{i+k} c' (mod 5).

These inequalities are independent of b and they can be reduced to:
a-c ≢ (-1)^{i+k} (a'-c') and a-c ≢ (-1)^{i+k} (c'-a'). For 5 out of the 10
values for y it holds that (-1)^{i+k} = 1, and for the other 5 values
(-1)^{i+k} = -1. The 2 conditions are split up into a-c ≢ a'-c' (mod 5)
and a-c ≢ -(a'-c') (mod 5) for the first condition and a-c ≢ c'-a'
(mod 5) and a-c ≢ -(c'-a') (mod 5) for the second one. The 2 pairs of
inequalities are apparently equivalent. From x ≠ z it follows in
this case that a ≢ c (mod 5) and hence a-c ≡ a'-c' (mod 5) and
a-c ≡ -(a'-c') (mod 5) cannot both hold at the same time. Thus if one
of the two holds, then the original inequalities are both valid for
5 of the 10 values for y and otherwise for all 10 values.
iii) (x,h(x)) and (z,h(z)) belong to complementary classes, say A_ij
and A_{1-i,1-j} respectively. Since either i or 1-i is equal to 0 and
since the conditions are symmetric with respect to x and z, it may
be assumed that i = 0 holds. Suppose that x = δ^a, y = δ^b ε^k, z = δ^c ε,
h(x) = δ^{a'} ε^j, h(z) = δ^{c'} ε^{1-j} holds. Substitution in the conditions
gives δ^a δ^b ε^k δ^{c'} ε^{1-j} ≠ δ^c ε δ^b ε^k δ^{a'} ε^j and δ^a δ^b ε^k δ^{a'} ε^j ≠ δ^c ε δ^b ε^k δ^{c'} ε^{1-j},
which becomes after setting the generating relation at work

δ^{a+b+(-1)^k c'} ε^{j+k+1} ≠ δ^{c-b+(-1)^{k+1} a'} ε^{j+k+1} and
δ^{a+b+(-1)^k a'} ε^{j+k} ≠ δ^{c-b+(-1)^{k+1} c'} ε^{j+k},

where the exponents of ε are taken modulo 2.
Since the exponent of ε is the same modulo 2 on both sides of both
inequalities it is necessary that the exponents of δ are different
modulo 5. That is a+b+(-1)^k c' ≢ c-b+(-1)^{k+1} a' (mod 5) and
a+b+(-1)^k a' ≢ c-b+(-1)^{k+1} c' (mod 5). After sorting the terms both
inequalities can be put in the form 2b ≢ c-a-(-1)^k (a'+c') (mod 5). Hence,
for each of the two values for k, there is just one of the 5 values
for b such that the inequality is false.


The following table gives a survey of the number of undetected jump 


errors. 





The best permutations would be those which score always a 0 in the
main diagonal. The number of triplets x,y,z, with z > x, which fall
in the complementary classes is 10(d^2+(5-d)^2) with d = |A_00|. Hence out
of the 450 cases 2(d^2+(5-d)^2) are bound to remain undetected. The
function d^2+(5-d)^2 is at least 13 so that at most 424 of the 450
possible jump errors are detected. This would be a detection rate
of 94.2%. It will be seen in 4.5 that there exist permutations which
achieve this result.

As to the phonetic errors it is sufficient to remark that the detecting
condition does not contradict the one for the transpositions, so that
a 100% detection seems feasible. Section 4.6 is devoted to the
construction of codes which reach that score.
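
The bound on the jump errors can be illustrated numerically. The
Python sketch below again assumes that the digit 5k + a stands for
δ^a ε^k; it counts the undetected triplets for a given h by brute
force and compares them with 2(d^2+(5-d)^2). The function names are
hypothetical and the code is only a check of the argument, not the
program used for the search.

```python
from itertools import product

def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def undetected_jump_twins(h):
    """Triplets x, y, z with x < z and x*y*h(x) == z*y*h(z)."""
    return sum(1 for x, y, z in product(range(10), repeat=3)
               if x < z and mul(mul(x, y), h[x]) == mul(mul(z, y), h[z]))

def undetected_jump_transpositions(h):
    """Triplets x, y, z with x < z and x*y*h(z) == z*y*h(x)."""
    return sum(1 for x, y, z in product(range(10), repeat=3)
               if x < z and mul(mul(x, y), h[z]) == mul(mul(z, y), h[x]))

def lower_bound(h):
    """2(d^2 + (5-d)^2) with d = |A_00|, the low digits that h maps to low digits."""
    d = sum(1 for x in range(5) if h[x] < 5)
    return 2 * (d * d + (5 - d) * (5 - d))

# Smallest possible number of undetected triplets and the resulting best rate.
BEST_UNDETECTED = min(2 * (d * d + (5 - d) ** 2) for d in range(6))
BEST_RATE = (450 - BEST_UNDETECTED) / 450
```

For every permutation h both counts are at least lower_bound(h), and
BEST_RATE evaluates to 424/450, the 94.2% quoted above.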


4.3 Detection rate preserving transformations





Let U be the set of all permutations f, satisfying x × f(y) ≠ y × f(x)
for all x,y ∈ D with x ≠ y and let V be the subset of U consisting of
the permutations f, such that x × f(x) = y × f(y) is valid for 2 or less
(unordered) pairs x,y with x ≠ y.

As in the cyclic case there are several transformations (of S) which
leave U and V invariant, but the situation is more complicated, since
D_5 is non-abelian and since the concept of duplicators cannot be used.
From the fact that D_5 is a group it follows that multiplication from
the right (or the left) by a fixed element a permutes the elements
of D. Let the resulting permutations be denoted by r_a and l_a
respectively. Hence r_a(x) = x × a and l_a(x) = a × x. The permutations l_a
with a ∈ D form a subgroup of the symmetric group S, called the left
regular representation of D_5. This subgroup is isomorphic with D_5 since
from l_a l_b(x) = l_a(b × x) = a × (b × x) = (a × b) × x = l_{a×b}(x) it follows that
l_a l_b = l_{a×b} and since l_a ≠ l_b for a ≠ b. The 10 different permutations
r_a with a ∈ D form also a subgroup of S, called the right regular
representation of D_5. This group is also isomorphic with D_5, since
from r_a r_b(x) = r_a(x × b) = x × b × a = r_{b×a}(x) it follows that r_a r_b = r_{b×a},
so that the order of multiplication is reversed. The transformations
R_a: f → r_a f leave U and V invariant, since from x × f(y) = z × f(u) it
follows that x × f(y) × a = x × r_a f(y) = z × r_a f(u) = z × f(u) × a, and
conversely. Hence if f ∈ U,V then also r_a f ∈ U,V. The induced equivalence
classes all have 10 different elements and in each class a
representative, which has 0 as a fixed point, can be selected. The
permutation r_{f(0)}^{-1} f is equivalent with f and has 0 as a fixed point.
Let U_0 and V_0 be defined by U_0 = {f | f(0) = 0, f ∈ U} and
V_0 = {f | f(0) = 0, f ∈ V}. Evidently |U_0| = |U|/10 and |V_0| = |V|/10 hold.
The search for permutations satisfying the conditions for the detection
of the transpositions and the twin errors may be limited by setting
f(0) = 0.
The transformation f → l_a f does not leave U invariant, as the following
counterexample shows. The permutation f given by

        0123456789
    f = (0432178956)

belongs to U, as will be seen later on. l_5 f however does not belong
to U since 5 × l_5 f(8) = 5 × 5 × f(8) = 5 and 8 × l_5 f(5) = 8 × 5 × f(5) = 3 × 7 = 5.
The transformation L_a: f → f l_a leaves both U and V invariant. This follows
at once by substituting a × x, a × y etc. in x × f(y) ≠ z × f(u) which gives
a × (x × f l_a(y)) ≠ a × (z × f l_a(u)). Since f l_a(0) = f(a) it is clear that U_0
is not invariant for all L_a.

The transformations R_a and L_b are permutable since R_a L_b(f) = R_a(f l_b) =
= r_a f l_b = L_b(r_a f) = L_b R_a(f) holds.
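
The behaviour of the three kinds of transformation can be tried out on
the counterexample. The Python sketch below assumes that the digit
5k + a stands for δ^a ε^k; the function names (in_U, R, L,
left_compose) are chosen here for illustration and are not notation
of the text.

```python
def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def inv(a):
    """Inverse of a in D_5 (the identity is the digit 0)."""
    return next(b for b in range(10) if mul(a, b) == 0)

def in_U(f):
    """True if x*f(y) != y*f(x) for all x != y."""
    return all(mul(x, f[y]) != mul(y, f[x])
               for x in range(10) for y in range(x + 1, 10))

def R(a, f):
    """R_a: f -> r_a f, the map x -> f(x)*a."""
    return [mul(f[x], a) for x in range(10)]

def L(a, f):
    """L_a: f -> f l_a, the map x -> f(a*x)."""
    return [f[mul(a, x)] for x in range(10)]

def left_compose(a, f):
    """f -> l_a f, the map x -> a*f(x); this one need not preserve U."""
    return [mul(a, f[x]) for x in range(10)]

f = [0, 4, 3, 2, 1, 7, 8, 9, 5, 6]   # lower row of the counterexample
bad = left_compose(5, f)             # l_5 f
```

With these definitions in_U(f) holds, in_U(R(a, f)) and in_U(L(a, f))
hold for every digit a, and in_U(bad) fails because 5 × bad[8] and
8 × bad[5] are both equal to 5. R(inv(f[0]), f) is the representative
with 0 as a fixed point.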

It is possible to construct a transformation T_a such that T_a(U_0) = U_0.
Define T_a by T_a(f) = r_{f(a)}^{-1} f l_a, that is T_a(f)(x) = f(a × x) × f(a)^{-1},
and let T_a(f) = g, then g(0) = f(a × 0) × f(a)^{-1} = 0. Moreover if T_b(g) = h
then g(x) = f(a × x) × f(a)^{-1} and

h(x) = g(b × x) × g(b)^{-1} = f(a × b × x) × f(a)^{-1} × (f(a × b) × f(a)^{-1})^{-1} =
     = f(a × b × x) × f(a × b)^{-1}

and hence h = T_{a×b}(f). Thus T_b T_a = T_{a×b} is valid.
Furthermore T_0(f) = f, since T_0(f)(x) = f(0 × x) × f(0)^{-1} = f(x) provided
that f(0) = 0 holds. The transformations T_a with a ∈ D, working on U_0,
form therefore a group. The equivalence classes induced do however
not always contain 10 permutations, as the following example will
show. Define f by f(δ^k) = δ^{-k} and f(δ^k ε) = δ^{k+d} ε with d ≠ 0, then
T_5(f)(x) = f(5 × x) × f(5)^{-1} gives for x = δ^k:

f(ε × δ^k) × f(ε)^{-1} = f(δ^{-k} ε) × (δ^d ε)^{-1} = δ^{-k+d} ε δ^d ε = δ^{-k} = f(δ^k)

and for x = δ^k ε:

f(ε × δ^k ε) × f(ε)^{-1} = f(δ^{-k}) × δ^d ε = δ^{k+d} ε = f(δ^k ε),

so that T_5(f) = f.

The relation x × f(y) ≠ y × f(x) is proved by treating the three cases
x and y both low or both high and x and y in different classes,
separately.

i) δ^i × f(δ^j) = δ^{i-j} whereas δ^j × f(δ^i) = δ^{j-i}, but i-j ≠ 0 since x ≠ y.
ii) δ^i ε × f(δ^j ε) = δ^i ε δ^{j+d} ε = δ^{i-j-d} but δ^j ε × f(δ^i ε) = δ^{j-i-d},
and again i-j ≠ 0.
iii) δ^i × f(δ^j ε) = δ^{i+j+d} ε and δ^j ε × f(δ^i) = δ^j ε δ^{-i} = δ^{i+j} ε, and d ≠ 0.

Thus f ∈ U holds, but the detection of the twin errors is bad, since
δ^i × f(δ^i) = 0 and δ^i ε × f(δ^i ε) = δ^{-d}. Hence 20 (i.e. twice the number of
pairs out of 5) of the 45 possible twin errors escape detection. This
code will be met again in the next chapter. Clearly f ∉ V_0 holds and
it will be seen in 4.4 that in V_0 the equivalence classes do contain
10 elements each.


Meanwhile it is not clear how this transformation can be used in the
search for U_0. The third transformation group is of a different
nature. It is a subgroup of the automorphism group of S, consisting
of the inner automorphisms derived from the elements of A, which
is, as automorphism group of D_5, a subgroup of S. The fact, which
will be proved below, that transformations by elements of A leave U
and V invariant means that A is contained in the normalizer of U and
V. The transformations are denoted by F_s with s ∈ A, and defined by
F_s(f) = s f s^{-1} for f ∈ S. The transformations form a group isomorphic
with A, since F_s F_t(f) = F_s(t f t^{-1}) = s t f t^{-1} s^{-1} = (st) f (st)^{-1} = F_{st}(f)
and since F_s ≠ F_t for s ≠ t.

From x × f(y) ≠ y × f(x) it follows that s(x × f(y)) ≠ s(y × f(x)) and as s
is an automorphism, the inequality becomes after substituting s^{-1}(x')
for x and s^{-1}(y') for y: x' × s f s^{-1}(y') ≠ y' × s f s^{-1}(x'). Hence F_s(f) ∈ U
if f ∈ U. The same kind of reasoning proves that F_s(f) ∈ V if f ∈ V.
Since the automorphisms leave 0 fixed, it follows that the
transformations F_s leave U_0 and V_0 invariant as well.

The induced equivalence classes do not necessarily have 20 elements
each. It can happen that F_s(f) = f, or what is the same, that sf = fs.
All the automorphisms except those belonging to the subgroup C_5 have
just one fixed point different from 0, say s(a) = a. Therefore, if
sf = fs then sf(a) = fs(a) = f(a) and hence f(a) = a. But then
a × f(0) = a = 0 × f(a) holds, and thus f ∉ U_0. The permutations permutable
with the subgroup C_5 are of the form ρ^i f', where f' is a permutation
which leaves each one of the high digits fixed, as can be readily
verified (see 8). In fact the example given above is of this very form.
These permutations give a poor detection of the twin errors.
The latter property follows from the proof of theorem 4.2.0, since
ρ^i f'(lo) = lo and ρ^i f'(hi) = hi, so that x × f(x) = lo for all x. The
detection rate is therefore at most 40/45 and f does not belong to V.


For the search for U_0 only a subgroup of A is useful, namely C_4,
consisting of the automorphisms σ^j, with 0 ≤ j ≤ 3. With this group a
factor 4 can be gained. Let U_0i and V_0i be defined by
U_0i = {f | f ∈ U_0, f(5) = i} and V_0i = {f | f ∈ V_0, f(5) = i}. Now the search
may be limited to say the classes U_03 and U_08 for, if f(5) ∈ {1,2,3,4}
then j can be chosen such that σ^j f σ^{-j}(5) = 3 and if f(5) ∈ {6,7,8,9}
then for some j: σ^j f σ^{-j}(5) = 8. Note that f(5) ≠ 5 if f ∈ U_0.





The same program as in the preceding chapter can be used, except for a
few changes. First of all the group operation has to be adapted to the
dihedral group. Secondly the test has to be changed. The program is
used twice i.e. once for U_03 in which f(5) = 3 and the other time for
U_08 with f(5) = 8. The result of the 90 seconds search is that
|U_03| = 404 and |U_08| = 447, so that |U| = 34040. There are no
permutations in U for which x × f(x) is nearly a permutation. Furthermore
|V_03| = 72 and |V_08| = 78 which makes |V| = 6000. It turns out that
the transformations L_a divide V into 600 classes and that the
transformations F_s with s ∈ A and R_a with a ∈ D give a further subdivision
of these classes. In V three permutations h_1, h_2 and h_3 can be chosen
such that each f ∈ V can be written as L_a R_b F_s h_i with a,b ∈ D; s ∈ A and
i ∈ {1,2,3}. The three permutations h_i are given in the table below.






The importance of this canonical representation will become clear
in the next sections.
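
A search of the kind described in this section is small enough to be
repeated today. The following backtracking sketch in Python is not the
program of the text; it assumes the digit 5k + a for δ^a ε^k, fixes
f(0) = 0 and tallies the permutations of U_0 by the value of f(5). The
name search_U0 is hypothetical.

```python
from collections import Counter

def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def search_U0():
    """Count permutations f with f(0) = 0 and x*f(y) != y*f(x), keyed by f(5)."""
    f = [0] + [None] * 9
    used = [True] + [False] * 9
    by_f5 = Counter()

    def extend(x):
        if x == 10:
            by_f5[f[5]] += 1
            return
        for v in range(10):
            if used[v]:
                continue
            # the new value v = f(x) must not clash with any earlier digit y
            if all(mul(x, f[y]) != mul(y, v) for y in range(x)):
                f[x] = v
                used[v] = True
                extend(x + 1)
                used[v] = False

    extend(1)
    return by_f5

counts = search_U0()
total_U = 10 * sum(counts.values())   # |U| = 10 |U_0|
```

The tallies for f(5) = 1, 2, 3, 4 agree with each other, as do those for
f(5) = 6, 7, 8, 9, which reflects the factor 4 gained by the
automorphisms σ^j; total_U can be compared with the size of U reported
above.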


4.5 The detection of the jump errors 





The requirement for the optimal detection of the jump transpositions
and the jump twin errors is the same, as was shown in section 2 of this
chapter. Let g and g' be two permutations of V, not necessarily
different. It will be said that g matches g' if x × y × gg'(x) ≠ z × y × gg'(z)
holds for 424 out of the 450 triplets x,y,z with x,y,z ∈ D and z > x.
For a "good" code a chain of matching g_i's is needed in order to
construct the f_i's recursively with f_{i+1} = g_i f_i and f_1 ∈ S. Just as
in chapter 3 it is advisable to make a catalogue of the matching
pairs as a preparation for the construction of the chains.

In view of the large number of possible pairs (36000000), other reasons
could be mentioned as well, it is worthwhile to exploit the equivalence
transformations of section 4.3. As a matter of fact it will turn out
that a factor 20000 can be gained on the number of tests to be
performed. To avoid unnecessarily complicated formulae the proof will
be given in several steps.

The transformations L_a, R_b and F_s with a,b ∈ D and s ∈ A satisfy the
following relations: 0) L_a L_b = L_{b×a}; 1) R_a R_b = R_{b×a}; 2) F_s F_t = F_{st};
3) L_a R_b = R_b L_a; 4) F_s R_a = R_{s(a)} F_s; 5) F_s L_a = L_{s(a)} F_s.

The first four relations have been proved in 4.3. The remaining two
follow directly from the definitions, for F_s R_a(f) = s(r_a f)s^{-1} and
for each x ∈ D, s r_a f s^{-1}(x) = s(f s^{-1}(x) × a) = s f s^{-1}(x) × s(a) and hence
4) is true. Also F_s L_a(f) = s(f l_a)s^{-1} and for each x ∈ D s f l_a s^{-1}(x) =
= s f(a × s^{-1}(x)) = s f s^{-1}(s(a) × x) and relation 5 follows at once.
Denote the number of solutions of x × y × gg'(x) = z × y × gg'(z) with z > x
and x,y,z ∈ D, by N(g,g'). The next step is to prove the relations:
6) N(g,g') = N(R_b g, g'); 7) N(g,g') = N(g, L_a g'); 8) N(g, F_s g') =
= N(F_s^{-1} g, g'); 9) N(L_a g, g') = N(F_r^{-1} g, R_a g') where r is the inner
automorphism defined by r(x) = a × x × a^{-1}.

Consider the equation x × y × gg'(x) = z × y × gg'(z) and multiply both sides
from the right by b to prove the relation 6. Substitution of a × x' for
x and a × z' for z followed by multiplication from the left of both sides
by a^{-1} gives relation 7. Now consider x × y × g s g' s^{-1}(x) = z × y × g s g' s^{-1}(z)
and apply the automorphism s^{-1} on both sides and substitute s(y') for y,
s(x') for x and s(z') for z and relation 8 results. The awkward factor
y in the middle proved to be helpful in this situation, since y may be
replaced by s(y'). Finally the equation x × y × g(a × g'(x)) = z × y × g(a × g'(z))
can be altered into

x × y × a × a^{-1} × g(a × (g'(x) × a) × a^{-1}) × a = z × y × a × a^{-1} × g(a × (g'(z) × a) × a^{-1}) × a

or x × y × a × r^{-1} g r(g'(x) × a) = z × y × a × r^{-1} g r(g'(z) × a), which gives
relation 9 after substituting y' × a^{-1} for the helpful y. By means of
the relations above it is easy to prove that

N(L_a R_b F_s h_i, L_c R_d F_t h_j) = N(F_{t^{-1} r^{-1} s} h_i, R_{t^{-1}(d×a)} h_j).

The L_c and the R_b can be removed by 7, 3 and 6, giving
N(L_a R_b F_s h_i, L_c R_d F_t h_j) = N(L_a F_s h_i, R_d F_t h_j). Application of 9
gives N(F_r^{-1} F_s h_i, R_a R_d F_t h_j). The F_t can be moved over R by 1 and 4,
since R_a R_d F_t = R_{d×a} F_t = F_t R_{t^{-1}(d×a)}, and after that, 8 gives
N(F_t^{-1} F_r^{-1} F_s h_i, R_{t^{-1}(d×a)} h_j). Finally application of 2 gives the
desired equality.
Consequently it is only necessary to test the 1800 pairs F_s h_i, R_a h_j
with s ∈ A, a ∈ D and i,j ∈ {1,2,3}.
If one matching pair has been found for certain s', a', i, j then all
20000 pairs L_a R_b F_s h_i, L_c R_d F_t h_j with s' = t^{-1} r^{-1} s, a' = t^{-1}(d × a)
and r(x) = a × x × a^{-1}, for x ∈ D, match. The result of the test program
is that 10 pairs of the form F_{s'} h_i, R_{a'} h_j match. Let the pairs be
denoted by the quartet (s',i,j,a'). The 10 pairs are (…);
(…,3,1,2); (…,2,2,4); (…); (…,3,2,7); (…); (…,1,2,6); (…,1,2,1); (…);
(…,2,3,6).

The diagram on the next page pictures the matching relations. It is
now a simple matter to construct codes which give an optimal detection
for the transpositions (100%), the twin errors (95.5%), the jump
transpositions (94.2%) and the jump twin errors (94.2%). A chain of
permutations g_i can be made by following the arrows in the diagram
and performing the operations as indicated along the lines.

Let L_a R_b F_s h_i match L_c R_d F_t h_j according to the quartet (s',i,j,a'), then
d × a = t(a') and s = r t s', with r(x) = a × x × a^{-1}. By selecting d so that
d = t(a'), it follows that a = 0 and that r is the identity, and hence
s = t s'.


With the following scheme a chain can be forged easily: 


[Scheme with the columns i, j, quartet, t and t(a') = d; its entries are not legible in this copy.]











Of special interest are the loops by h_2 and h_3 since they offer the
possibility to form progressive codes which have all g_i's equal. So can
F_s L_a R_b h_2 be self-matching if s^{-1} r'^{-1} s = s' and b × a = 4, where s'
and 4 are taken from the quartet (s',2,2,4) of the loop and r' is
defined by r'(x) = s(a) × x × s(a^{-1}). Hence s^{-1} r' s(x) =
= s^{-1}(s(a) × s(x) × s(a)^{-1}) = a × x × a^{-1} and it follows that a = 3 and b = 1,
so that F_s L_3 R_1 h_2 is self matching for all s. Now

                       0123456789         0123456789     0123456789
L_3 R_1 h_2 = L_3 R_1 (0582637941) = L_3 (1973546802) = (3519702468) = (03986215)(47).

The loop by h_3 gives rise to the self matching permutations F_s L_… h_3
where

           0123456789
L_… h_3 = (7961042358) = (07319854)(26).

The resulting progressive codes all have a period 8. None of these
codes is phonetic error-proof, but by choosing s suitably the
permutation (01589427)(36) is obtained which provides a code with a
detection rate of 95.3% for the phonetic errors.
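
The two-row computation for the loop by h_2 can be replayed in a few
lines of Python. The sketch assumes the digit 5k + a for δ^a ε^k and
takes the lower row of h_2 from the two-row symbol in the text, with
the digits restored so that every row is a permutation; the
automorphism s used at the end is one particular choice made here for
illustration, and all function names are hypothetical.

```python
def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def R(b, f):
    """R_b: f -> r_b f, the map x -> f(x)*b."""
    return [mul(f[x], b) for x in range(10)]

def L(a, f):
    """L_a: f -> f l_a, the map x -> f(a*x)."""
    return [f[mul(a, x)] for x in range(10)]

def F(s, f):
    """F_s: f -> s f s^-1 for an automorphism s given as a list of images."""
    s_inv = [s.index(x) for x in range(10)]
    return [s[f[s_inv[x]]] for x in range(10)]

def cycles(f):
    """Cycle representation of f, each cycle starting at its smallest digit."""
    seen, out = set(), []
    for start in range(10):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = f[x]
        out.append(tuple(cyc))
    return out

h2 = [0, 5, 8, 2, 6, 3, 7, 9, 4, 1]      # assumed lower row of h_2
g = L(3, R(1, h2))                       # [3, 5, 1, 9, 7, 0, 2, 4, 6, 8]
s = [0, 2, 4, 1, 3, 7, 9, 6, 8, 5]       # delta^a eps^k -> delta^(2a+2k) eps^k
verhoeff = F(s, g)                       # [1, 5, 7, 6, 2, 8, 3, 0, 9, 4]
```

Here cycles(g) gives (03986215)(47) and cycles(verhoeff) gives
(01589427)(36), the permutation mentioned at the end of the section.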
4.6 The detection of the phonetic errors





The purpose of this section is to construct chains of matching
permutations, in the sense of section 4.5, with the property that the
phonetic errors are detected too. The permutations f_k occurring in the
check equation Π f_k(a_k) = c which defines the code, are given recursively
by f_{k+1} = g_k f_k with f_1 arbitrarily chosen in S. The permutation g_{k+1}
has to match g_k for k ≥ 1, but g_1 can be taken arbitrarily in U. The
detecting condition f_k(x) × f_{k+1}(0) ≠ f_k(1) × f_{k+1}(x) may be written as
x × g_k f_k(0) ≠ f_k(1) × g_k(x) for x ∈ D.

By putting p = f_k(1) and q = f_k(0) the condition becomes x × g_k(q) ≠
≠ p × g_k(x) for x ∈ D. The set of pairs p,q such that the above condition
is fulfilled, will be called the initial set of g_k. The new values for
p and q which are offered to g_{k+1} are g_k(p) and g_k(q) (or f_{k+1}(1),
f_{k+1}(0)). These pairs are said to form the terminal set of g_k. The
initial sets of h_1, h_2 and h_3, as defined at the end of section 4.4,
can easily be derived from the condition. Let the initial set
of h_i be denoted by X_i, then the sets X_i turn out to be: X_1 = {01,04,
06,07,12,13,14,19,21,27,28,38,39,43,45,51,56,63,67,68,72,73,75,78,80,
87,90,91,92,94}; X_2 = {01,03,09,17,18,20,25,28,32,38,43,46,49,56,57,
59,62,65,67,71,72,78,84,86,87,89,90,91,92,93,94} and X_3 = {02,04,08,
10,16,17,19,21,23,24,28,35,37,42,47,51,53,58,62,64,69,70,72,75,79,
81,82,91,93,95,96}. Thus |X_1| = 30, |X_2| = |X_3| = 31 holds. If
g_k = L_c R_d F_t h_i, then the initial set of g_k can be derived from the
initial set of h_i. The checking condition for g_k is
x × t h_i t^{-1}(c × q') × d ≠ p' × t h_i t^{-1}(c × x) × d, which is equivalent with
t^{-1}(x) × h_i t^{-1}(c × q') ≠ t^{-1}(p') × h_i t^{-1}(c × x), which after substitution
of x' for t^{-1}(c × x) gives x' × h_i(t^{-1}(c × q')) ≠ t^{-1}(c × p') × h_i(x'). Thus
the pair t^{-1}(c × p'), t^{-1}(c × q') has to belong to X_i and hence for some
pair p,q from X_i: p = t^{-1}(c × p') and q = t^{-1}(c × q') has to hold. The latter
equations are equivalent with p' = c^{-1} × t(p) and q' = c^{-1} × t(q) and the
initial set of g_k is c^{-1} × t(X_i). The terminal set of g_k is formed by
the pairs t h_i t^{-1}(c × p') × d, t h_i t^{-1}(c × q') × d or t h_i(p) × d, t h_i(q) × d. The
terminal set of g_k is therefore t h_i(X_i) × d.
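
The initial set of a permutation, and its transformation under L_c, R_d
and F_t, can be computed directly from the definition. The Python
sketch below assumes the digit 5k + a for δ^a ε^k and uses as an example
the lower row of h_2 as it can be read from the two-row symbol in
section 4.5 (an assumption, since the table of the h_i is not
reproduced here); the function names are illustrative.

```python
def mul(x, y):
    """Product in D_5, assuming digit 5*k + a stands for delta^a epsilon^k."""
    a, i = x % 5, x // 5
    b, k = y % 5, y // 5
    rot = (a + b) % 5 if i == 0 else (a - b) % 5
    return rot + 5 * (i ^ k)

def inv(a):
    """Inverse of a in D_5 (the identity is the digit 0)."""
    return next(b for b in range(10) if mul(a, b) == 0)

def R(d, f):
    """R_d: f -> r_d f, the map x -> f(x)*d."""
    return [mul(f[x], d) for x in range(10)]

def L(c, f):
    """L_c: f -> f l_c, the map x -> f(c*x)."""
    return [f[mul(c, x)] for x in range(10)]

def F(t, f):
    """F_t: f -> t f t^-1 for an automorphism t given as a list of images."""
    t_inv = [t.index(x) for x in range(10)]
    return [t[f[t_inv[x]]] for x in range(10)]

def initial_set(g):
    """Pairs (p, q) with x*g(q) != p*g(x) for every digit x."""
    return {(p, q) for p in range(10) for q in range(10)
            if all(mul(x, g[q]) != mul(p, g[x]) for x in range(10))}

def transformed_initial_set(X, c, t):
    """Initial set of L_c R_d F_t h derived from the initial set X of h."""
    return {(mul(inv(c), t[p]), mul(inv(c), t[q])) for p, q in X}

h2 = [0, 5, 8, 2, 6, 3, 7, 9, 4, 1]      # assumed lower row of h_2
X2 = initial_set(h2)
```

A pair with p = q can never occur in an initial set, since the choice
x = q makes both sides equal. The set X2 can be compared with the list
printed above, and transformed_initial_set(X2, c, t) with the initial
set computed directly for L(c, R(d, F(t, h2))).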
Now if F_u h_i matches R_e h_j, then g_{k+1} matches g_k = L_c R_d F_t h_j if
g_{k+1} = L_a R_b F_s h_i with d × a = t(e) and s = r t u, where r is again the
inner automorphism defined by r(x) = a × x × a^{-1}. The phonetic errors are
detected by this new link if (t h_j(p) × d, t h_j(q) × d) ∈ a^{-1} × s(X_i), as is
derived above. This relation means that for certain (p',q') ∈ X_i it has
to hold that t h_j(p) × d = a^{-1} × s(p') and t h_j(q) × d = a^{-1} × s(q'). Substitution
of r t u for s gives t h_j(p) × d = a^{-1} × (a × t u(p') × a^{-1}) = t u(p') × a^{-1} and the
similar equation for q. Substitution of t(e) × a^{-1} for d gives, since
d = t(e) × a^{-1}, h_j(p) = u(p') × e^{-1} and similarly for q the equation
h_j(q) = u(q') × e^{-1}. These conditions are only dependent on u and e,
which depend only on the linking mode which is employed, to get from
j to i (see diagram of section 4.5). The conditions may be viewed
as a directed graph K. The points of K are given by (i, h_i(p), h_i(q)),
with i = 1, 2, 3 and (p,q) ∈ X_i. Now (j, u(p) × e^{-1}, u(q) × e^{-1}) is connected
with (i, h_i(p), h_i(q)) if F_u h_i matches R_e h_j and if (p,q) ∈ X_i. The 10
matching modes thus give rise to 308 directed edges. A phonetic
error-proof code has to be based on a chain of permutations g_i, which has
to correspond with a path in the graph K. The construction is therefore
brought back to the problem of finding paths, preferably circuits, in
the directed graph K. The circuits are interesting since they give
infinite, though periodic, codes. The circuits, if they exist, can
be found by the method described in 3.5. The twigs of K can be
pruned off and after 10 prunings, as it turns out, a twig-free
directed graph with 35 edges is left over.
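
The pruning can be sketched as follows. The Python code below is not
the method of 3.5 itself; it assumes that a twig is an edge whose
terminal vertex is not the initial vertex of any edge, and it works on
a small made-up graph rather than on K. Reversing all edges and pruning
again removes, in the same way, the edges at the other end of the
graph.

```python
def prune_twigs(edges):
    """Repeatedly drop edges whose terminal vertex has no outgoing edge.

    Returns the twig-free edge set and the number of pruning rounds.
    """
    edges = set(edges)
    rounds = 0
    while True:
        initial = {u for u, _ in edges}
        twigs = {(u, v) for u, v in edges if v not in initial}
        if not twigs:
            return edges, rounds
        edges -= twigs
        rounds += 1

def reverse(edges):
    """The same graph with the direction of every edge reversed."""
    return {(v, u) for u, v in edges}

# Hypothetical example: a circuit 1 -> 2 -> 3 -> 1 with a tail and an entry edge.
example = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (0, 1)}
twig_free, rounds = prune_twigs(example)
core = reverse(prune_twigs(reverse(twig_free))[0])
```

A non-empty twig-free graph must contain a circuit, because every edge
can be followed by another one indefinitely while the number of
vertices is finite.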

This proves that a circuit has to be present. Among these 35 edges
there may be edges which have an initial vertex which is not the
terminal vertex of any other edge. Edges like this may be called
roots of the directed graph. The roots can be removed by the same
procedure, be it that the direction of the edges has to be reversed.
After cutting off all the roots, there remains a graph with 18 edges
and 16 vertices. A picture is given on page 99. The graph has a
rich structure, since there are three basic circuits, A, B and C,
such that tours can be organized, such that A, B and C are visited
in an arbitrary order, with any multiplicity. The tours are thus
in a one to one correspondence with the free semigroup generated
by A, B and C. The circuits B and C have each three edges, whereas
A has 14 edges. Most attractive, because of their simplicity, are
the codes based on one of these smaller circuits only. As an example
one of these codes will be constructed. It is made of the matching 
pairs (F abo Reka): (F zaan (F Rn The vertices 

of C tes (1,6,2); (3,31). ihn of the initial sets 
of ho h, and he which correspond with the vertices are: (1,3); 
Clor €Spd. 

Let these pairs be denoted by (p;,a,) and let the matching pairs be 


‚Rh,}; (F_ h_‚R‚h.); (F_ h 


6 3 S, ee | s. zE h_}. The construction 


denoted by SE elo 


à 
of the g-chain may begin by taking an arbitrary permutation from U
as g_1. If this permutation in the canonical representation is
L_c R_d F_t h_i, then the permutation f_1 has to be chosen so that
f_1(1) = c^{-1} × t(p) and f_1(0) = c^{-1} × t(q), the other values of f_1 may be


chosen arbitrarily. For the matching pairs it holds that N(F, h_‚Reh) 


= 26 and by the rules 6 and 4 of section 4.5, it follows that 


en ze NE ee Ee NG een = 26, in an 


analogous way the other matching relations may be transformed. Putting 


R‚h, = h'; Rh, = hl and Rh, = hl it follows that 26 = NF, h‚ha) ze 


ee: 1 6 2 2 6 3 3 il 2 
ae ï î a î 1 s El 8. EE 1. 
Nn e Ne holds. Now putting B, = hj: 80 REL 
NE | ON Ks Bok h} and so on, a chain with 
3 SS, 2 4 555453 Ì 5 5551 S3°2 3 


the desired properties is found. With the aid of the relation 8 of 
section 4.5 it can be easily shown that Ne En) = 26, In the table 


below the first 13 permutations f are given. 





a add Sl 





This particular code has a period of 12.

In general a code can be constructed as follows: First select a route
in the linking-graph K. Second take g_1 in accordance with the route
selected, say g_1 = L_c R_d F_t h_j, where the j is fixed by the route
selected. Then f_1 is free in S, provided that f_1(1) and f_1(0) are
suitable for g_1 and the selected route. If the second g is in the
canonical representation L_a R_b F_s h_i and if the matching pair given by
the route selection is (F_u h_i, R_e h_j), then d × a = t(e) and s = r t u has to
hold. Hence only b can be chosen freely. The same holds for each of
the following permutations g_k. For each prechosen route in K there
are 8! possible choices for f_1, 2000 for g_1 and 10 for each following
g_k. Note that the … may be used at the ends of the chain.
All these codes detect all single errors, all transpositions and all
phonetic errors. Of the twin errors 95.5% is detected and of the
jump transpositions and jump twin errors 94.2% is detected. In all
classes the detection is optimal for codes defined by a check equation
Π f_i(a_i) = c in the dihedral group D_5.







Historically this chapter should have preceded the foregoing one. In it,
a code is explained, which was the first, pure decimal, one to detect all
single errors and all transpositions. Like the I.B.M. code mentioned in
2.3, it was designed without regard for the jump errors or the twin
errors. It was sheer luck that the first one did detect more than 50%
for these types. In 5.3 a generalization is given which scores nearly
90% in said classes. It is very remarkable that the first bi-quinary
code is, at the same time, a code based on the dihedral group, whereas
the generalization is not interpretable as such. As a matter of fact,
the present author tried in 1955 to design a transposition-proof code
based on the dihedral group, but without success. Instead the bi-quinary
code of 5.1 was found.

Though the bi-quinary code met the requirements, set at that time, it
was considered to be of mainly theoretical interest, since the
complexity of the check equations did not encourage the design of a
verifier. For a switching circuit, which performs the checking, see 51.
The circuit is incorporated in a larger switching system used in the
library of the University of Technology at Delft (see 52).

Later on, A. Benard gave an interpretation of the code based on the
addition modulo 10. The weights used in the check equation are
dependent on the value of the code word itself. It is therefore a
non-linear code and for that reason the non-existence proof of 3.3 does not
apply. The generalization of the code admits an analogous
interpretation.


5.1 The first bi-quinary code





The set {0,1,2,3,4,5,6,7,8,9} is mapped on the Cartesian product of the
sets {0,1,2,3,4} and {0,1}. The five element set will be denoted
by V and the set with two elements by W. The set of the ten decimals
is called D. Each decimal digit x is thus mapped on a pair (v,w) with
v ∈ V and w ∈ W. The mapping is quite arbitrary, but it may be advantageous
to use a natural way, like v ≡ x (mod 5) and w ≡ x (mod 2). In this chapter
the digits 1,2,3,4,5 will be called low and the other ones high. This
is done in accordance with the conventional telephone switching techniques,
in which the 0 is represented by 10 pulses and the other digits by the
number of pulses indicated by that digit. So, low means less than 6
pulses and high means more than 5 pulses. If a is mapped on (x,y), then
it is convenient to have a notation for this relation. Therefore two
functions, denoted by v and w, are introduced. The functions map D onto
V and W respectively by the definition v(a)=x and w(a)=y. Throughout
this chapter only mappings are used such that {x | w(x)=0} is a complete
set of representatives modulo 5. In this section w(x)=0 holds for the
low digits and w(x)=1 for the high ones. The sets V and W can be made
into groups by defining an addition. For V this addition is the addition
modulo 5 and for W it is the addition modulo 2. Hence (V,+) = C_5 and
(W,+) = C_2, since both groups are cyclic. As usual in mathematics, it is
not thought necessary to employ different signs for the various
additions. For untrained readers and computers this is sometimes
confusing. Let a_1 a_2 ... a_n be a word of D^n and let t_j be defined by the


122 Ee 
bg Aeg moed 2) and t 


recursion tt. =O, hence aka The first 


j+1 0) 
bi=guinary code C consists of those code words satisfying: koe in Co 
Be 3 D Re 
and (1) (vla) vla.) 2wla, ))t{ 1) (vla) vla) 2wlag))t. =0 in Cs 


The terms $w(a_i)$ occurring in the latter equation, which are 0 or 1 in W,
have to be interpreted as 0 and 1 in V. Strictly speaking a mapping $\varphi$
had to be introduced which maps W into V, such that $\varphi(0)=0$ and
$\varphi(1)=1$. In the formula, $\varphi(w(a_i))$ should then have been used
instead of $w(a_i)$.
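
The two mappings mentioned so far can be written down in a few lines. The following is only a sketch: the function names are illustrative, and v(x) = x mod 5 is assumed for the quinary component (the text calls this the natural choice but does not prescribe it). The check at the end tests the requirement that {x | w(x)=0} is a complete set of representatives modulo 5.

```python
def v(x: int) -> int:
    """Quinary component; x modulo 5 is an assumed, 'natural' choice."""
    return x % 5


def w_low_high(x: int) -> int:
    """Binary component of this section: 0 for the low digits 1..5, else 1."""
    return 0 if 1 <= x <= 5 else 1


def w_parity(x: int) -> int:
    """Alternative binary component: x modulo 2."""
    return x % 2


def is_admissible(w) -> bool:
    """True if {x | w(x) = 0} is a complete set of representatives modulo 5."""
    zeros = [x for x in range(10) if w(x) == 0]
    return sorted(x % 5 for x in zeros) == [0, 1, 2, 3, 4]


def pairs(w) -> dict:
    """The pair (v(x), w(x)) for every decimal digit x."""
    return {x: (v(x), w(x)) for x in range(10)}


for w in (w_low_high, w_parity):
    assert is_admissible(w)
    # The mapping D -> V x W is one-to-one: ten different pairs occur.
    assert len(set(pairs(w).values())) == 10
```

With the low/high convention the digit 5 is mapped onto (0,0) and the digit 0 onto (0,1), in agreement with the pulse count 10 for the 0.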


The following lemmata can be proved: 


5.1.1 The code C is $E_1$-proof.
5.1.2 $|C| = 10^{n-1}$, i.e. the code can be considered as a code with (n-1)
information digits and 1 check digit.
5.1.3 The code C is transposition-proof.
5.1.4 The code C is phonetic error-proof.
re 5.1.1 The change of any digit $a_i$ may imply a change of $w(a_i)$, in
which case the first check equation ceases to be valid. Otherwise the
quinary value $v(a_i)$ has to change, but this will violate the second
equation, since all the values $w(a_i)$ and $t_i$ are unaffected by the change.
re 5.1.2 For each of the $10^{n-1}$ choices of the $a_i$ with $i<n$, it is
possible to find one and only one digit $a_n$ so that both equations are
valid. Through the first equation, $\sum_i w(a_i) = 0$, the value of $w(a_n)$ is
fixed. After that, all the functions $t_i$ have a known value too and hence
$v(a_n)$ can be solved from the second equation. The pair $v(a_n)$, $w(a_n)$
fixes the digit $a_n$.
re 5.1.3 The formal proof of this lemma is laborious since many different
cases have to be considered. It will be left out here, since the property
will be proved later on. It is of interest however to explain the clue of
the strange second check equation, which contains two parts. The first
part is a weighted sum modulo 5 of the quinary values $v(a_i)$, with weights
dependent on the w-values of the digits. The second part is solely
dependent on these w-values; it will be called the binary function of
the second check equation.
Now the detection of the transpositions in all the bi-quinary codes of
this chapter is based on the following principles:
The first equation, as a straight sum modulo 2, will never detect a
transposition. Let a and b be the transposed digits, then:
1) If w(a)=w(b), and therefore $v(a)-v(b) \ne 0$, then the first part of
the second equation, $\sum \pm v(a_i)$, will change value. The binary function
can of course not change.
2) If $w(a) \ne w(b)$, and therefore v(a)-v(b) may take all 5 values of V,
then the binary function changes value. The first part of the second
equation remains constant in this case. This is necessary, since if
it were allowed to change, then for one of the 5 values of v(a)-v(b)
the change of the binary function would be compensated.
It is left to the reader to check that the present equations satisfy
these principles.
It will be clear from the considerations above that for the binary
function many other possibilities exist, since the only requirement
is that it changes value if two adjacent digits with different
w-value are interchanged.
re 5.1.4 Since w(0)=1 and w(1)=0 holds, the phonetic error $1x \leftrightarrow x0$
will spoil the first equation. Also for this argument it is good to take 0
as a high digit.
It will turn out in the course of this chapter that the twin error
detection rate is 5/9 and the jump error detection rate will appear
to be 2/3.





It is possible to define the same code recursively. Let $c_0$ be chosen
so that $w(c_0)=0$ and $v(c_0)=0$ (hence $c_0=5$ in the convention of this
chapter). Define $c_{2k+1}$ and $c_{2k}$ by:


2k+1 2k kwle 


zr twlag): 


3 2k 


weg) 


j=v le.) vla, ); 


wi rn 


jewle twan.) and vle, 
) 


wie 
2k+1 
Gora) Yea) tCD) (v{ 


Cak+1 k 


v )-2wla, he 


doka +1 


The code C consists of those words for which $c_n = 5$ holds. From these
recursive formulae, which are immediately clear from the equations
of the preceding section, it follows that a Latin staircase is possible
(see chapter 1).


The following two Latin squares are applied alternately:

[Two 10 x 10 Latin squares; the table entries are not legible in this copy.]



The entries of the two tables are written in a somewhat unusual order
to show the similarity with the multiplication table of the dihedral group.
In fact, after interchanging 0 and 5 both in the entries and in the body
of the table, two tables are obtained which are column permutations of
the Cayley table of $D_5$ on page 83. Let the dihedral group after the
interchange of 0 and 5 be denoted by $D'_5$; hence in $D'_5$: $5 = e$,
$0 = \varepsilon$ and $j = \delta^j$, $j+5 = \delta^j \varepsilon$ for j=1,2,3,4. The recurring
relations become:
c mk 6 f (a and = Xf : i — 
ok+1 Cor” { ok+1) n Bn Et ) in De, where using the conven 


tions of chapter 4, £C5 Dag sr, (8 Tedes" Ve and £(6 )=8 sE (6 edz6 Pe. 
jj Ì ad | 

Th =f_ f ee 0 

Jd 


Sh rd +2 „li ei 
kaa ze f z=Â 


_J+3 | 
E 


=| 
Ef f (ST e}=ê 


1 2 


Hence both 8, and g, are permutations of the type given in the example 


2 
of section 4.3,which gives rise to a transposition=proof code. The code 


(a, )=5 in D', with h =f, and h, =f 


an th fine À 
c d us also be defined by TI h, k 5 ok+1 ji ok“ fo 


k=l 
From this interpretation of the code it follows immediately that the 


detection rate of the twin errors is 5/9, see p. 91. The jump error 
detection rate is also easily found by the method of 4.2. In order to 
apply this method the distribution of the pairs Ges 8,8, 0) and Ge, 88, GO) 
over the classes de should be krown. Now 8, and B, are each others 
inverse, so that all the pairs fall in Ao and Az that is 5 in each 
class. Moreover, the difference of the exponent of 8 in x and B, 8,5) 

is always 0. Hence there are 20 pairs of x,z which cause 5 undetected
jump errors each. To the complementary classes belong 25 pairs x,z
each giving 2 undetected jump errors, so that 20·5 + 25·2 = 150 jump
errors remain undetected and only 300 of the 450 possible jump errors
will be detected.

It is the tragedy of codes like the one above that, even though the error
type with the low detection rate has a small frequency, it may occur
that in the set of undetected errors the given type is dominant. The
result is that the code looks very bad, giving the impression that a
major class of errors has been overlooked.

By taking another mapping of D on V×W, A. Benard gave an elegant
interpretation of the same code.

Let the mapping be defined by $v(x) \equiv x \pmod 5$ and $w(x) \equiv x \pmod 2$. Let
$a_1 a_2 \ldots a_n$ be a code word. Now Benard remarks that the odd digits
separate the even digits into, possibly empty, runs. The runs, including
the empty ones, can be numbered from the left to the right. A run with an
even serial number will be called an even run. Let the odd digits of
the code word be numbered from the left to the right and let o(j) denote
the j-th odd digit. Let furthermore the even digits of the code word
also be numbered from the left to the right and let e(j) denote the
j-th even digit. Now put $\sum_j (-1)^j o(j) = S_o$ and $\sum_j (-1)^j e(j) = S_e$, and
let the number of even digits occurring in even runs be denoted by K.
The code C is defined as the set of code words satisfying:
$S_o + S_e \equiv 2K \pmod{10}$. This code is essentially the same as the one
defined in 5.1. The check equation $S_o + S_e \equiv 2K$ taken modulo 2 is the
same as the first equation of 5.1 and taken modulo 5 the equation becomes:
$\sum_j (-1)^j v(o(j)) + \sum_j (-1)^j v(e(j)) \equiv 2K \pmod 5$.
Now let $a_i$ be the j-th odd digit, then $a_i = o(j)$ and $t_i \equiv j \pmod 2$.
If $i = 2i'-1$ then $j \equiv t_{2i'-1} \equiv t_{2i'-2} + w(a_{2i'-1}) \equiv t_{2i'-2} + 1 \pmod 2$
and if $i = 2i'$ then $j \equiv t_{2i'} \equiv t_{2i'-1} + w(a_{2i'}) \equiv t_{2i'-1} + 1 \pmod 2$.
Hence for the odd digits the coefficient of $v(a_i)$ is the same in the
modulo 5 equation of both interpretations. On the other hand, if $a_i$ is
the j-th even digit, then $a_i = e(j)$ and $j \equiv i - t_i \pmod 2$. Hence for
$i = 2i'-1$ it holds that $j \equiv 1 + t_{2i'-1} \equiv 1 + t_{2i'-2} \pmod 2$ and for
$i = 2i'$ it follows that $j \equiv t_{2i'} \equiv t_{2i'-1} + w(a_{2i'}) \equiv t_{2i'-1} \pmod 2$.
Hence also for the even digits the coefficients of $v(a_i)$ are the same
in both modulo 5 equations. Only the


binary function of the Benard equation is different, but 2K does have
the property that it changes value if two adjacent digits with different
parity are interchanged. It follows that the code defined in the Benard
fashion has the same detection rate for the transpositions and the
phonetic errors. This can also very easily be proved directly, using the
principles of page 104. For interchanging adjacent digits with the same
parity is detected, since either $S_o$ or $S_e$ but not 2K changes value.
Interchanging digits with different parity only changes 2K, since the even
and the odd digits retain their serial number, but one of the even digits
comes in a run of a different parity. For the proof that the single errors
are detected, it is sufficient to observe that the number of odd digits
in a valid code word is always even. A parity-changing single error
disturbs this rule and a non-parity-changing error is detected by
$S_o$ or $S_e$, since all the coefficients are unaffected by the error.

The binary function, 2K, in the Benard variant is a less fortunate choice,
since the twin errors are only detected if even twins from an even run
are changed into odd twins. This gives a twin error detection rate of
about (25/45)/2=27.8%. The jump error detection is independent of the
binary function. It should be noted that parity may be read as w-value.
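
The Benard form of the check lends itself to a direct test by machine. The sketch below is an illustration only: the function names are invented here, the runs and the serial numbers are counted from 1, and the sign convention $(-1)^j$ is assumed for both sums. It enumerates all words of length 4 and confirms that every single error and every transposition of adjacent digits is detected. The twin error rate is merely computed; for such short words it need not equal the approximate figure quoted above.

```python
from itertools import product


def benard_valid(word):
    """Test S_o + S_e = 2K (mod 10) for a tuple of decimal digits."""
    n_odd = n_even = 0          # serial numbers of the odd / even digits
    s_o = s_e = k = 0
    for a in word:
        if a % 2:
            n_odd += 1
            s_o += (-1) ** n_odd * a
        else:
            n_even += 1
            s_e += (-1) ** n_even * a
            k += n_odd % 2      # run number n_odd + 1 is even iff n_odd is odd
    return (s_o + s_e - 2 * k) % 10 == 0


def single_errors(word):
    for i, a in enumerate(word):
        for b in range(10):
            if b != a:
                yield word[:i] + (b,) + word[i + 1:]


def transpositions(word):
    for i in range(len(word) - 1):
        if word[i] != word[i + 1]:
            yield word[:i] + (word[i + 1], word[i]) + word[i + 2:]


def twin_errors(word):
    for i in range(len(word) - 1):
        if word[i] == word[i + 1]:
            for b in range(10):
                if b != word[i]:
                    yield word[:i] + (b, b) + word[i + 2:]


def detection_rate(code, errors):
    """Fraction of the generated erroneous words that fail the check."""
    total = detected = 0
    for c in code:
        for e in errors(c):
            total += 1
            detected += not benard_valid(e)
    return detected / total


n = 4
code = [wd for wd in product(range(10), repeat=n) if benard_valid(wd)]
assert len(code) == 10 ** (n - 1)               # one check digit
assert detection_rate(code, single_errors) == 1.0
assert detection_rate(code, transpositions) == 1.0
twin_rate = detection_rate(code, twin_errors)   # compare with the text
```

The exhaustive loop is feasible only for short words; it serves as a check on the argument, not as a practical verifier.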


The purpose of generalizing a formula is often to create the possibility
of selecting another specialization, which has more desirable properties
than the original formula. In other words, after creating more freedom
of choice the selection of a better code becomes feasible. It is a
delicate question what a proper generalization is in this respect. The
Latin staircase method with arbitrary Latin squares, for instance, is
certainly a generalization, but it is of little help because there
is no easy way to test the merits of the resulting code, combined with
an overwhelming number of possibilities. A generalization should preserve
some basic idea. Finding and formulating the basic idea of a method
is essential for finding a generalization. The clue of the present code
is thought to be the bi-quinary representation of the decimals in
combination with the peculiar structure of the second check equation. The
success hinges on the fact that a quinary transposition-proof code is
possible. A weighted quinary code, defined by $\sum_i u_i z_i \equiv 0 \pmod 5$, is
transposition-proof if the adjacent weights are different. These weights
may depend on the binary components of the digits. The binary word
$w(a_1)w(a_2) \ldots w(a_n)$ will be called the binary key of the decimal code
word $a_1 a_2 \ldots a_n$. The binary key should be parity checked and therefore
it always detects the single errors which change the binary key. Those
single errors which leave the binary key invariant are to be detected
by the quinary check equation $\sum_i u_i v(a_i) \equiv B \pmod 5$, where B is the
key-dependent binary function. This implies that none of the coefficients
$u_i$ may vanish. The principles of the transposition detection were
(see page 104) that the left hand side of the quinary equation remained
invariant if the binary key changed and was changed if the binary key
remained invariant. The latter property is fulfilled as soon as the
adjacent weights are different. The invariance under key-changing
transpositions is a much more severe requirement for the key-dependent
weights $u_i$. Let $b_1$ and $b_2$ be two keys which are equal on all places
but the first two, and let $b_1$ start with 01 and $b_2$ with 10; then
$u_i(b_1) = u_i(b_2)$ for $i > 2$ has to hold. Furthermore $u_1(b_1) = u_2(b_2)$ and
$u_2(b_1) = u_1(b_2)$ are also necessary conditions. For each key b it has
to hold that $u_i(b) \ne u_{i+1}(b)$ if b has equal bits on the i-th and the
(i+1)-th position.
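
The conditions on the key-dependent weights can be tested mechanically for a given weight rule. The following sketch makes two assumptions that are not part of the text above: the invariance condition is applied to every pair of adjacent places rather than to the first two only, and the example weight is taken as $2^s \pmod 5$, where s is the serial number of the place among the places carrying the same bit. All names are illustrative.

```python
from itertools import product


def serial_weights(key):
    """Example rule: u_i = 2**s mod 5, s = serial number among equal bits."""
    seen = {0: 0, 1: 0}
    weights = []
    for bit in key:
        seen[bit] += 1
        weights.append(pow(2, seen[bit], 5))
    return weights


def satisfies_conditions(weight_fn, n):
    """Check the weight conditions for all binary keys of length n."""
    for key in product((0, 1), repeat=n):
        u = weight_fn(key)
        if 0 in u:                      # no coefficient may vanish
            return False
        for i in range(n - 1):
            if key[i] == key[i + 1]:
                if u[i] == u[i + 1]:    # equal bits need different weights
                    return False
            else:
                # Key-changing transposition: the two weights travel with
                # their digits, all other weights stay where they are.
                swapped = key[:i] + (key[i + 1], key[i]) + key[i + 2:]
                expected = u[:i] + [u[i + 1], u[i]] + u[i + 2:]
                if weight_fn(swapped) != expected:
                    return False
    return True


assert satisfies_conditions(serial_weights, 6)
```

A key-independent alternating rule such as 1,4,1,4,... fails the test, since its weights cannot follow the digits when the key changes.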
The obvious improvement strived for is a better detection of the twin
errors and the jump transpositions. A cumbersome analysis reveals that
100% detection cannot be achieved in either category by this method. The
optimal result can be explained best by reconsidering the first bi-quinary
code. The coefficients occurring in the second check equation are
exclusively +1 or -1, so that an alternating quinary check is employed. It is
however well-known that equations like $\sum_i 2^i a_i \equiv 0 \pmod 5$ yield much
better codes for pure quinary code words. It is therefore obvious to try to
exploit this circumstance. A decimal code can be defined as follows:
Define, using the same notation as in 5.2, $T_o$ and $T_e$ by $T_o = \sum_j 2^j o(j)$
and $T_e = \sum_j 2^j e(j)$, and let B be a binary function which is modulo 5
sensitive for the transposition of digits with unequal parity (or
w-value). The code C is defined as the set of words satisfying:
$\sum_i w(a_i) \equiv 0 \pmod 2$ and $T_o + T_e \equiv B \pmod 5$. If w is defined by
$w(x) \equiv x \pmod 2$ then the two equations may be combined into one by
setting $T'_o = \sum_j 7^j o(j)$, giving $T'_o + T_e \equiv B' \pmod{10}$, where B'=B if
B is even and B'=B+5 if B is odd. The equation modulo 5,
just as the second equation of the first bi-quinary code, has the
property that the interchange of adjacent digits with different parity
does not change the left hand side, since there is no change in the
serial number of the odd or even digits. The binary function B however
will change. On the other hand, if two digits with the same parity are
interchanged, then the function B will not change, whereas one of the
sums $T_o$ or $T_e$ will. Hence each transposition will disturb the second
check equation. The advantage of the generalization is that the
non-parity-changing twin errors are always detected. This follows at once
from: $2^j a + 2^{j+1} a = 2^j(3a) \ne 2^j(3a') = 2^j a' + 2^{j+1} a'$ and $a - a' \ne 5$.
The parity-changing twin errors disturb a lot in the equation, since all
odd and all even digits which follow the error get another serial number.
Also the function B may change value. For each a there are five possible
values for a', which are all different modulo 5, hence for each a there is
just one a' which compensates whatever changes occurred through the parity
change. There are never two values for a' which do so, since otherwise the
single error which interchanges these two a''s would not be detected. So 5
of the 45 twin errors per position are undetected, giving a rate of 8/9.


For the jump errors there are several cases to be considered.

i) The interchanged digits and the digit in between all have the same
parity. This type is detected, since $2^j a + 2^{j+2} b \ne 2^j b + 2^{j+2} a$ is
equivalent with $2^j(3a) \ne 2^j(3b)$ and since $a - b \ne 0$.

ii) The interchanged digits have the same parity, which is different
from the parity of the middle one. These errors are detected since
$2^j a + 2^{j+1} b \ne 2^j b + 2^{j+1} a$ holds, as $a \ne b$.

iii) The interchanged digits have a different parity. Now 5 out of the
25 possible combinations are undetected, since for each a there are 5
different values for b possible. Only one of these values will leave the
second equation true.

The net result is that the jump transposition detection rate is 8/9.
The jump twin errors are more slippery. Suppose that aba becomes cbc
somewhere in the code word. Consider three cases.

i) w(a)=w(b)=w(c). Then the error is not detected because
$2^j a + 2^{j+2} a = 2^j(5a) \equiv 0 \pmod 5$.

ii) $w(a) = w(c) \ne w(b)$. This error is detected because
$2^j a + 2^{j+1} a = 2^j(3a) \ne 2^j(3c) = 2^j c + 2^{j+1} c$ and since $a \not\equiv c \pmod 5$.

iii) $w(a) \ne w(c)$. Then again 1 of the 5 possible values of c gives a
valid check equation.

Hence 15 of the 45 errors will be undetected, thus yielding a detection
rate of 2/3. This latter rate is not easily improved upon.
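
The properties claimed for the generalized code can be confirmed by enumeration for short words. The sketch below needs a concrete binary function; it borrows B = 2K from the Benard variant of 5.2, which is an assumption made here for the sake of the example and not a choice made in the text. The function names are likewise illustrative. Only those error classes are asserted for which detection is certain: single errors, transpositions, twin errors that keep the parity, and jump transpositions of digits with equal parity.

```python
from itertools import product


def generalized_valid(word):
    """Even number of odd digits and T_o + T_e = B (mod 5), B = 2K assumed."""
    n_odd = n_even = 0      # serial numbers of the odd / even digits
    t_o = t_e = k = 0
    for a in word:
        if a % 2:
            n_odd += 1
            t_o += pow(2, n_odd, 5) * a
        else:
            n_even += 1
            t_e += pow(2, n_even, 5) * a
            k += n_odd % 2  # counts the even digits lying in an even run
    return n_odd % 2 == 0 and (t_o + t_e - 2 * k) % 5 == 0


def patch(word, i, new):
    """Replace the digits of word from position i onwards by the tuple new."""
    return word[:i] + new + word[i + len(new):]


def single_errors(word):
    for i, a in enumerate(word):
        yield from (patch(word, i, (b,)) for b in range(10) if b != a)


def transpositions(word):
    for i in range(len(word) - 1):
        a, b = word[i], word[i + 1]
        if a != b:
            yield patch(word, i, (b, a))


def same_parity_twin_errors(word):
    for i in range(len(word) - 1):
        a = word[i]
        if a == word[i + 1]:
            yield from (patch(word, i, (b, b)) for b in range(10)
                        if b != a and b % 2 == a % 2)


def same_parity_jump_transpositions(word):
    for i in range(len(word) - 2):
        a, m, c = word[i:i + 3]
        if a != c and a % 2 == c % 2:
            yield patch(word, i, (c, m, a))


n = 4
code = [wd for wd in product(range(10), repeat=n) if generalized_valid(wd)]
assert len(code) == 10 ** (n - 1)
for errors in (single_errors, transpositions,
               same_parity_twin_errors, same_parity_jump_transpositions):
    assert all(not generalized_valid(e) for c in code for e in errors(c))
```

The parity-changing twin and jump errors are deliberately left out of the assertions, since their detection depends on the binary function chosen.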

The choice of the function B is less important in this generalized
code, but for some technical implementations a shrewd choice may be
of influence. To make this clear an example is sketched. Let $e_j$ be
the number of even digits preceding the j-th digit of a code word and
let $o_j$ be the number of odd digits preceding the j-th one. Hence
$e_j + o_j = j - 1$. Now $B = \sum_{w(a_j)=1} (2^{o_j} - 2^{e_j})$ may be taken and the
second check equation may be written as
$\sum_{w(a_j)=1} 2^{o_j} a_j + \sum_{w(a_j)=0} 2^{e_j} a_j \equiv B \pmod 5$. Let


v and w again be defined by w(x)=0 for $x \in \{1,2,3,4,5\}$ and else w(x)=1,
and $v(x) \equiv x \pmod 5$. Suppose that the digits are fed into a verifier
as pulse trains according to the convention that the digit x is
represented by a train with x pulses, with the understanding that the 0 is
counted for 10. Without going into the details, it may be pointed out
that the high digits are recognized only after the 6-th pulse is
received. It can be so arranged that the pulses of each train are
treated in the beginning according to the 'even mode' and only after
receiving 6 pulses the treatment is changed into the 'odd mode'. The
result is that the even (that is low) digits are counted correctly,
but that the high digits gave 6 pulses in the wrong mode, whereas the
remaining pulses are treated correctly. This difference modulo 5 is
precisely needed for the binary function B. The peculiar form of the
function used in 5.1 also comes from technical considerations.
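
The pulse argument can be imitated in a few lines. The sketch rests on assumptions that go beyond the text: the verifier is modelled as an accumulator that adds $2^{e_j}$ for every pulse handled in the even mode and $2^{o_j}$ for every pulse handled in the odd mode, and B is read as $\sum_{w(a_j)=1} (2^{o_j} - 2^{e_j})$. Under these assumptions the accumulated count equals the left hand side of the second check equation minus B, modulo 5, for every word.

```python
from itertools import product


def is_high(x):
    """w(x) = 1: the digits 6, 7, 8, 9 and 0 (the 0 counts for 10 pulses)."""
    return not 1 <= x <= 5


def pulse_accumulator(word):
    """Count pulses; the first 6 pulses of each train use the even mode."""
    acc = 0
    e = o = 0                   # low (even mode) / high (odd mode) digits so far
    for x in word:
        pulses = 10 if x == 0 else x
        for p in range(1, pulses + 1):
            acc += pow(2, e, 5) if p <= 6 else pow(2, o, 5)
        if is_high(x):
            o += 1
        else:
            e += 1
    return acc % 5


def equation_residual(word):
    """Left hand side of the second check equation minus B, modulo 5."""
    lhs = b = 0
    e = o = 0
    for x in word:
        if is_high(x):
            lhs += pow(2, o, 5) * (x % 5)
            b += pow(2, o, 5) - pow(2, e, 5)    # assumed form of B
            o += 1
        else:
            lhs += pow(2, e, 5) * (x % 5)
            e += 1
    return (lhs - b) % 5


assert all(pulse_accumulator(wd) == equation_residual(wd)
           for wd in product(range(10), repeat=4))
```

In this model a word satisfies the second check equation exactly when the accumulator ends at 0 modulo 5, so no separate circuit for B is required.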
Though this generalized bi-quinary code improves upon the one of 5.1,
it is still of a lower standard than the codes of chapter 4.
Mathematically it has a very interesting property, which alone would be a
sufficient reason, or excuse, for mentioning it. It is the only code so far
which is not of the Latin staircase type. In other words, there is no
recursive definition like $c_j = f(c_{j-1}, a_j)$.
The recursion can be given with the aid of an auxiliary binary quantity
$z_j$. From a technical point of view this means that an extra memory is
needed.